CHIMERIC GENE CONSTRUCTS FOR GENERATION OF FLUORESCENT
TRANSGENIC ORNAMENTAL FISH

FIELD OF THE INVENTION 

This invention relates to fish gene promoters and chimeric gene constructs with these
promoters for generation of transgenic fish, particularly fluorescent transgenic ornamental
fish.

BACKGROUND OF THE INVENTION 

The transgenic technique involves the transfer of a foreign gene into a host organism,
enabling the host to acquire a new and inheritable trait. The technique was first developed
in mice by Gordon et al. (1980). They injected foreign DNA into fertilized eggs and found
that some of the mice developed from the injected eggs retained the foreign DNA.
Applying the same technique, Palmiter et al. (1982) introduced a chimeric gene
containing a rat growth hormone gene under a mouse heavy metal inducible gene promoter
and generated the first batch of genetically engineered supermice, which were almost twice
as large as their non-transgenic siblings. This work opened a promising avenue for using the
transgenic approach to confer new and beneficial traits on animals for livestock husbandry and
aquaculture.

In addition to the stimulation of somatic growth for increasing the gross production
of animal husbandry and aquaculture, the transgenic technique also has many other
potential applications. First, transgenic animals can be used as bioreactors to
produce commercially useful compounds by expression of a useful foreign gene in milk or
in blood. Many pharmaceutically useful protein factors have been expressed in this way.
For example, human α1-antitrypsin, which is commonly used to treat emphysema, has
been expressed at a concentration as high as 35 mg/ml (10% of milk proteins) in the milk
of transgenic sheep (Wright et al., 1991). Similarly, the transgenic technique can also be
used to improve the nutritional value of milk by selectively increasing the levels of certain
valuable proteins such as caseins and by supplementing certain new and useful proteins
such as lysozyme for antimicrobial activity (Maga and Murray, 1995). Second, transgenic
mice have been widely used in medical research, particularly in the generation of
transgenic animal models for human disease studies (Lathe and Mullins, 1993). More
recently, it has been proposed to use transgenic pigs as organ donors for
xenotransplantation by expressing human regulators of complement activation to prevent
hyperacute rejection during organ transplantation (Cozzi and White, 1995). The
development of disease resistant animals has also been tested in transgenic mice (e.g. Chen
et al., 1988).

Fish are also an intensively studied subject in transgenic research. There are many
ways of introducing a foreign gene into fish, including microinjection (e.g. Zhu et al.,
1985; Du et al., 1992), electroporation (Powers et al., 1992), sperm-mediated gene transfer
(Khoo et al., 1992; Sin et al., 1993), gene bombardment or gene gun (Zelenin et al., 1991),
and liposome-mediated gene transfer (Szelei et al., 1994). The first transgenic fish report
was published by Zhu et al. (1985) using a chimeric gene construct consisting of a mouse
metallothionein gene promoter and a human growth hormone gene. Most of the early
transgenic fish studies concentrated on growth hormone gene transfer with the aim of
generating fast growing "superfish". A majority of early attempts used heterologous
growth hormone genes and promoters and failed to produce gigantic superfish (e.g.
Chourrout et al., 1986; Penman et al., 1990; Brem et al., 1988; Gross et al., 1992). But
enhanced growth of transgenic fish has been demonstrated in several fish species including
Atlantic salmon, several species of Pacific salmon, and loach (e.g. Du et al., 1992; Devlin
et al., 1994, 1995; Tsai et al., 1995).

The zebrafish, Danio rerio, is a new model organism for vertebrate developmental
biology. As an experimental model, the zebrafish offers several major advantages such as
easy availability of eggs and embryos, tissue clarity throughout embryogenesis,
external development, short generation time and easy maintenance of both the adult and
the young. Transgenic zebrafish have been used as an experimental tool in zebrafish
developmental biology. However, despite the fact that the first transgenic zebrafish was
reported a decade ago (Stuart et al., 1988), most transgenic zebrafish work conducted so far
used heterologous gene promoters or viral gene promoters: e.g. viral promoters from SV40
(simian virus 40) and RSV (Rous sarcoma virus) (Stuart et al., 1988, 1990; Bayer and
Campos-Ortega, 1992), a carp actin promoter (Liu et al., 1990), and mouse homeobox gene
promoters (Westerfield et al., 1992). As a result, the expression pattern of a transgene in
many cases is variable and unpredictable.

GFP (green fluorescent protein) was isolated from a jellyfish, Aequorea victoria.
The wild type GFP emits green fluorescence at a wavelength of 508 nm upon stimulation
with ultraviolet light (395 nm). The primary structure of GFP has been elucidated by
cloning of its cDNA and genomic DNA (Prasher et al., 1992). A modified GFP, also called
EGFP for enhanced green fluorescent protein, has been generated artificially. It
contains mutations that allow the protein to emit a stronger green light, and its coding
sequence has also been optimized for higher expression in mammalian cells based on
preferred human codons. As a result, EGFP fluorescence is about 40 times stronger than
the wild type GFP in mammalian cells (Yang et al., 1996). GFP (including EGFP) has
become a popular tool in cell biology and transgenic research. By fusing GFP with a tested
protein, the GFP fusion protein can be used as an indicator of the subcellular location of
the tested protein (Wang and Hazelrigg, 1994). By transformation of cells with a
functional GFP gene, the GFP can be used as a marker to identify expressing cells (Chalfie
et al., 1994). Thus, the GFP gene has become an increasingly popular reporter gene for
transgenic research as GFP can be easily detected by a non-invasive approach.

The GFP gene (including the EGFP gene) has also been introduced into zebrafish in
several previous reports by using various gene promoters, including the Xenopus elongation
factor 1α enhancer-promoter (Amsterdam et al., 1995, 1996), rat myosin light-chain
enhancer (Moss et al., 1996), zebrafish GATA-1 and GATA-3 promoters (Meng et al., 1997;
Long et al., 1997), zebrafish α- and β-actin promoters (Higashijima et al., 1997), and
tilapia insulin-like growth factor I promoter (Chen et al., 1998). All of these transgenic
studies aimed either at developing a GFP transgenic system for gene expression analysis or at
testing regulatory DNA elements in gene promoters.

SUMMARY OF THE INVENTION 

It is a primary objective of the invention to clone fish gene promoters of skin
specificity, muscle specificity or ubiquitous function and to use these promoters to develop
effective gene constructs for production of transgenic fish.

It is another objective of the invention to develop fluorescent transgenic ornamental
fish using these gene constructs. By applying different gene promoters, tissue-specific or
ubiquitous, to drive the GFP gene, GFP could be expressed in different tissues or
ubiquitously. Thus, these transgenic fish may be skin fluorescent, muscle fluorescent,
ubiquitously fluorescent, or inducibly fluorescent. These transgenic fish may be used for
ornamental purposes, for monitoring environmental pollution, and for basic studies such as
recapitulation of gene expression programs or monitoring cell lineage and cell migration.
These transgenic fish may be used for cell transplantation and nuclear transplantation or
fish cloning.

Other objectives, features and advantages of the present invention will become
apparent from the detailed description which follows, or may be learned by practice of the
invention.

Three zebrafish gene promoters of different characteristics were isolated and three
chimeric gene constructs containing a zebrafish gene promoter and EGFP DNA were
made: pCK-EGFP, pMCK-EGFP and pARP-EGFP. The first chimeric gene construct,
pCK-EGFP, contains a zebrafish cytokeratin (CK) gene promoter (2.2 kb) which is specifically
or predominantly expressed in skin epithelia. The second one, pMCK-EGFP, contains a
muscle-specific promoter (1.5 kb) from a zebrafish muscle creatine kinase (MCK) gene
and the gene is only expressed in the muscle tissue. The third one, pARP-EGFP, contains a
strong and ubiquitously expressed promoter from a zebrafish acidic ribosomal protein
(ARP) gene. These three chimeric gene constructs have been introduced into zebrafish at
the one cell stage by microinjection. In all cases the GFP expression patterns were
consistent with the specificities of the promoters. GFP was predominantly expressed in
skin epithelia with pCK-EGFP, specifically expressed in muscles with pMCK-EGFP, and
ubiquitously expressed in all tissues with pARP-EGFP.

These chimeric genes will be useful to generate green fluorescent transgenic fish.
The GFP transgenic fish emit green fluorescent light under a blue light and this feature
makes the genetically engineered fish unique and attractive in the ornamental fish market.
Meanwhile, the fluorescent transgenic fish are also useful as research models for
embryonic studies such as cell lineage, cell migration, cell and nuclear transplantation etc.

BRIEF DESCRIPTION OF THE DRAWINGS

Figs. 1A-1G are photographs showing expression of CK (Figs. 1A-1C), MCK
(Figs. 1D-1E) and ARP (Figs. 1F-1G) mRNAs in zebrafish embryos as revealed by whole
mount in situ hybridization (detailed description of the procedure can be found in Thisse et
al., 1993). (Fig. 1A) A 28 hpf (hour postfertilization) embryo hybridized with a CK
antisense riboprobe. (Fig. 1B) Enlargement of the mid-part of the embryo shown in Fig.
1A. (Fig. 1C) Cross-section of the embryo in Fig. 1A. (Fig. 1D) A 30 hpf embryo
hybridized with an MCK antisense riboprobe. (Fig. 1E) Cross-section of the embryo in Fig.
1D. (Fig. 1F) A 28 hpf embryo hybridized with an ARP antisense riboprobe. (Fig. 1G)
Cross-section of the embryo in Fig. 1F. Arrows indicate the planes for cross-sections and
the box in panel A indicates the enlarged region shown in panel B.

Fig. 2 is a digitized image showing distribution of CK, MCK and ARP mRNAs in
adult tissues. Total RNAs were prepared from selected adult tissues as indicated at the top
of each lane and analyzed by Northern blot hybridization (detailed description of the
procedure can be found in Gong et al., 1992). Three identical blots were made from the
same set of RNAs and hybridized with the CK, MCK and ARP probes, respectively.

Fig. 3 is a schematic representation of the strategy of promoter cloning. Restriction
enzyme digested genomic DNA was ligated with a short linker DNA which consists of
Oligo 1 and Oligo 2. Nested PCR reactions were then performed: the first round PCR used
linker specific primer L1 and gene specific primer G1, where G1 is CK1, MCK1 or ARP1
in the described embodiments, and the second round used linker specific primer L2 and gene
specific primer G2, where G2 is CK2, MCK2 or ARP2, respectively, in the described
embodiments.

Fig. 4 is a schematic map of the chimeric gene construct, pCK-EGFP. The 2.2 kb
zebrafish CK promoter region is inserted into pEGFP-1 (Clontech) at the EcoRI and
BamHI sites as indicated. In the resulting chimeric DNA construct, the EGFP gene is under
control of the zebrafish CK promoter. Also shown is the kanamycin/neomycin resistance
gene (Kan^r/Neo^r) in the backbone of the original pEGFP-1 plasmid. The total length of the
recombinant plasmid pCK-EGFP is 6.4 kb.

Fig. 5 is a schematic map of the chimeric gene construct, pMCK-EGFP. The 1.5 kb
zebrafish MCK promoter region is inserted into pEGFP-1 (Clontech) at the EcoRI and
BamHI sites as indicated. In the resulting chimeric DNA construct, the EGFP gene is under
control of the zebrafish MCK promoter. Also shown is the kanamycin/neomycin resistance
gene (Kan^r/Neo^r) in the backbone of the original pEGFP-1 plasmid. The total length of the
recombinant plasmid pMCK-EGFP is 5.7 kb.

Fig. 6 is a schematic map of the chimeric gene construct, pARP-EGFP. The 2.2 kb
zebrafish ARP promoter/1st intron region is inserted into pEGFP-1 (Clontech) at the
EcoRI and BamHI sites as indicated. In the resulting chimeric DNA construct, the EGFP
gene is under control of the zebrafish ARP promoter. Also shown is the
kanamycin/neomycin resistance gene (Kan^r/Neo^r) in the backbone of the original pEGFP-1
plasmid. The total length of the recombinant plasmid pARP-EGFP is 6.4 kb.
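
The insert sizes and total plasmid lengths stated for Figs. 4-6 can be checked against one another with simple arithmetic. The following is a minimal sketch in Python; the dictionary layout and the function name implied_backbone_kb are illustrative assumptions, and the backbone length it reports is only inferred from the stated figures (it ignores the short vector fragment removed between the two restriction sites).

```python
# Sizes in kb, taken from the descriptions of Figs. 4-6.
constructs = {
    "pCK-EGFP": {"promoter_kb": 2.2, "total_kb": 6.4},
    "pMCK-EGFP": {"promoter_kb": 1.5, "total_kb": 5.7},
    "pARP-EGFP": {"promoter_kb": 2.2, "total_kb": 6.4},
}


def implied_backbone_kb(promoter_kb: float, total_kb: float) -> float:
    """Backbone length implied by a stated insert size and total length."""
    # Round to one decimal place, matching the precision of the stated sizes.
    return round(total_kb - promoter_kb, 1)


# Hypothetical consistency check: every construct should imply the same backbone.
backbones = {name: implied_backbone_kb(**sizes) for name, sizes in constructs.items()}

for name, kb in backbones.items():
    print(f"{name}: implied pEGFP-1 backbone of about {kb} kb")
```

All three constructs imply the same backbone length, which is what one would expect when the same vector is used for each.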

Fig. 7 is a photograph of a typical transgenic zebrafish fry (4 days old) with
pCK-EGFP, which emits green fluorescence from skin epithelia under a blue light.

Fig. 8 is a photograph of a typical transgenic zebrafish fry (3 days old) with
pMCK-EGFP, which emits green fluorescence from skeletal muscles under a blue light.

Fig. 9 is a photograph of a typical transgenic zebrafish fry (2 days old) with
pARP-EGFP, which emits green fluorescence under a blue light from a variety of cell types such
as skin epithelia, muscle cells, lens, neural tissues, notochord, circulating blood cells and
yolk cells.






DETAILED DESCRIPTION OF THE INVENTION
Gene Constructs 

To develop successful transgenic fish with a predictable pattern of transgene
expression, the first step is to make a gene construct suitable for transgenic studies. The
gene construct generally consists of three portions: a gene promoter, a structural gene and
transcriptional termination signals. The gene promoter determines where, when and
under what conditions the structural gene is turned on. The structural gene contains a
protein coding region that determines the protein to be synthesized and thus the
biological function. The transcriptional termination signals consist of two parts: a
polyadenylation signal and a transcriptional termination signal after the polyadenylation
signal. Both are important to terminate the gene transcription. Among the three portions,
selection of a promoter is very important for a successful transgenic study, and it is
preferable to use a homologous promoter (homologous to the host fish) to ensure accurate
gene activation in the transgenic host.

Recombinant DNA Constructs

Recombinant DNA constructs comprising one or more of the DNA or RNA
sequences described herein and an additional DNA and/or RNA sequence are also included
within the scope of this invention. These recombinant DNA constructs usually have
sequences which do not occur in nature or exist in a form that does not occur in nature or
exist in association with other materials that do not occur in nature. The DNA and/or RNA
sequences described hereinabove are "operably linked" with other DNA and/or RNA
sequences. DNA regions are operably linked when they are functionally related to each
other. For example, DNA for a presequence or secretory leader is operably linked to DNA
for a polypeptide if it is expressed as part of a preprotein which participates in the secretion
of the polypeptide; a promoter is operably linked to a coding sequence if it controls the
transcription of the coding sequence; or a ribosome binding site is operably linked to a
coding sequence if it is positioned so as to permit translation. Generally, operably linked
means contiguous (or in close proximity to) and, in the case of secretory leaders,
contiguous and in reading phase.

The sequences of some of the DNAs, and the corresponding proteins encoded by
the DNA, which are useful in the invention are set forth in the attached Sequence Listing.

The complete cytokeratin (CK) cDNA sequence is shown in SEQ ID NO:1, and its
deduced amino acid sequence is shown in SEQ ID NO:2. The binding sites of the gene
specific primers for promoter amplification, CK1 and CK2, are indicated. The extra
nucleotides introduced into CK2 for generation of a restriction site are shown as a
misc_feature in the primer sequence SEQ ID NO:11. A potential polyadenylation signal,
AATAAA, is indicated in SEQ ID NO:1.

The complete muscle creatine kinase (MCK) cDNA sequence is shown in SEQ ID
NO:3, and its deduced amino acid sequence is shown in SEQ ID NO:4. The binding sites
of the gene specific primers for promoter amplification, MCK1 and MCK2, are indicated.
The extra nucleotides introduced into MCK1 and MCK2 for generation of restriction sites
are shown as a misc_feature in the primer sequences SEQ ID NOS:12 and 13, respectively.
A potential polyadenylation signal, AATAAA, is indicated in SEQ ID NO:3.

The complete acidic ribosomal protein P0 (ARP) cDNA sequence is shown in SEQ
ID NO:5, and its deduced amino acid sequence is shown in SEQ ID NO:6. The binding
sites of the gene specific primers for promoter amplification, ARP1 and ARP2, are
indicated. The extra nucleotides introduced into ARP2 for generation of a restriction site are
shown as a misc_feature in the primer sequence SEQ ID NO:15. A potential
polyadenylation signal, AATAAA, is indicated in SEQ ID NO:5.

SEQ ID NO:7 shows the complete sequence of the 2.2 kb CK promoter region. A
putative TATA box is shown, and the 3' nucleotides identical to the 5' CK cDNA sequence
are shown as a misc_feature. The binding site of the second gene specific primer, CK2, is
shown. The introduced BamHI site is indicated as a misc_feature in the primer sequence
SEQ ID NO:11.

SEQ ID NO:8 shows the complete sequence of the 1.5 kb MCK promoter region. A
putative TATA box is shown, and the 3' nucleotides identical to the 5' MCK cDNA
sequence are shown as a misc_feature in SEQ ID NO:8. The binding site of the second
gene specific primer, MCK2, is shown. The introduced BamHI site is indicated as a
misc_feature in the primer sequence SEQ ID NO:13.

SEQ ID NO:9 shows the complete sequence of the 2.2 kb ARP promoter region
including the first intron. The first intron is shown, and the 3' nucleotides identical to the 5'
ARP cDNA sequence are shown as misc_features. No typical TATA box is found. The
binding site of the second gene specific primer, ARP2, is shown. The introduced BamHI
site is indicated as a misc_feature in the primer sequence SEQ ID NO:15.

Specifically Exemplified Polypeptides/DNA 

The present invention contemplates use of DNA that codes for various polypeptides
and other types of DNA to prepare the gene constructs of the present invention. DNA that
codes for structural proteins, such as fluorescent peptides including GFP, EGFP, BFP,
EBFP, YFP, EYFP, CFP, ECFP and enzymes such as luciferase, β-galactosidase,
chloramphenicol acetyltransferase, etc. are useful in the present invention. More
particularly, the DNA may code for polypeptides comprising the sequences exemplified in
SEQ ID NOS:2, 4 and 6. The present invention also contemplates use of particular DNA
sequences, including regulatory sequences, such as promoter sequences shown in SEQ ID
NOS:7, 8 and 9. Finally, the present invention also contemplates the use of additional
DNA sequences, described generally herein or described in the references cited herein, for
various purposes.

Chimeric Genes 

The present invention also encompasses chimeric genes comprising a promoter
described herein operatively linked to a heterologous gene. Thus, a chimeric gene can
comprise a promoter of a zebrafish operatively linked to a zebrafish structural gene other
than that normally found linked to the promoter in the genome. Alternatively, the
promoter can be operatively linked to a gene that is exogenous to a zebrafish, as
exemplified by the GFP and other genes specifically exemplified herein. Furthermore, a
chimeric gene can comprise an exogenous promoter linked to any structural gene not
normally linked to that promoter in the genome of an organism.

Variants of Specifically Exemplified Polypeptide 

DNA which codes for variants of the specifically exemplified polypeptides are also
encompassed by the present invention. Possible variants include allelic variants and
corresponding polypeptides from other organisms, particularly other organisms of the same
species, genus or family. The variants may have substantially the same characteristics as
the natural polypeptides. The variant polypeptide will possess the primary property of
concern for the polypeptide. For example, the polypeptide will possess one or more or all
of the primary physical (e.g., color) and/or biological (e.g., enzymatic activity) properties
of the specifically described polypeptide. DNA of the structural genes of the present
invention will encode a protein that produces a fluorescent or chemiluminescent light
under conditions appropriate to the particular polypeptide in one or more tissues of a fish.
Preferred tissues for expression are skin, muscle, eye and bone.

Substitutions, Additions and Deletions

As possible variants of the above specifically exemplified polypeptides, the
polypeptide may have additional individual amino acids or amino acid sequences inserted
into the polypeptide in the middle thereof and/or at the N-terminal and/or C-terminal ends
thereof so long as the polypeptide possesses the desired physical and/or biological
characteristics. Likewise, some of the amino acids or amino acid sequences may be deleted
from the polypeptide so long as the polypeptide possesses the desired physical
characteristics. Amino acid substitutions may also be made in the sequences so long as the
polypeptide possesses the desired physical and biochemical characteristics. DNA coding
for these variants can be used to prepare gene constructs of the present invention.

Sequence Identity at the Amino Acid Level

The variants of polypeptides contemplated herein should possess more than 75%
sequence identity (sometimes referred to as homology), preferably more than 85% identity,
most preferably more than 95% identity, even more preferably more than 98% identity to
the naturally occurring and/or specifically exemplified polypeptides or fragments thereof
described herein. To determine this homology, two polypeptides are aligned so as to obtain
a maximum match using gaps and inserts.

Two sequences are said to be "identical" if the sequence of residues is the same
when aligned for maximum correspondence as described below. The term
"complementary" applies to nucleic acid sequences and is used herein to mean that the
sequence is complementary to all or a portion of a reference polynucleotide sequence.

Optimal alignment of sequences for comparison can be conducted by the local
homology algorithm of Smith and Waterman (1981), by the homology alignment method
of Needleman and Wunsch (1970), by the search for similarity method of Pearson and
Lipman (1988), or the like. Computer implementations of the above algorithms are
known as part of the Genetics Computer Group (GCG) Wisconsin Genetics Software
Package (GAP, BESTFIT, BLAST, FASTA and TFASTA), 575 Science Drive, Madison,
WI.
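
To illustrate the kind of global alignment referred to above, the following is a minimal textbook sketch of the Needleman and Wunsch dynamic programming method in Python. It is not the GCG implementation; the function name needleman_wunsch and the scoring values (match +1, mismatch -1, linear gap penalty -1) are assumptions chosen only for illustration.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Globally align a and b; return (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


aligned_a, aligned_b, total = needleman_wunsch("ACGT", "ACT")
print(aligned_a)
print(aligned_b)
print(total)
```

Practical alignment programs typically use substitution matrices and affine gap penalties rather than the flat scores assumed here.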

"Percentage of sequence identity" is determined by comparing two optimally 
aligned sequences over a comparison window, wherein the portion of the sequence in the
comparison window may comprise additions or deletions (i.e. "gaps") as compared to the
reference sequence for optimal alignment of the two sequences being compared. The
percentage identity is calculated by determining the number of positions at which the
identical residue occurs in both sequences to yield the number of matched positions,
dividing the number of matched positions by the total number of positions in the window
and multiplying the result by 100 to yield the percentage of sequence identity. Total
identity is then determined as the average identity over all of the windows that cover the
complete query sequence.
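
The calculation described above can be sketched as follows. This is a minimal illustration in Python that assumes the two sequences have already been aligned, with gaps written as "-"; the function names percent_identity and total_identity and the example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of window positions at which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    # A position counts as matched only if the residues agree and neither is a gap.
    matched = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap)
    # Divide by the total number of positions in the window, gaps included.
    return 100.0 * matched / len(aligned_a)


def total_identity(windows) -> float:
    """Average identity over all windows covering the query sequence."""
    values = [percent_identity(a, b) for a, b in windows]
    return sum(values) / len(values)


# Hypothetical example: 4 matched positions out of 6 in the window.
print(round(percent_identity("ACGT-A", "ACCTGA"), 1))
```

Whether gap positions are counted in the denominator varies between programs; the sketch follows the wording above and counts every position in the window.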
Fragments of Polypeptide

Genes which code for fragments of the full length polypeptides such as proteolytic
cleavage fragments which contain at least one, and preferably all, of the above-listed
physical and/or biological properties are also encompassed by the present invention.

DNA and RNA

The invention encompasses DNA that codes for any one of the above-described
polypeptides including, but not limited to, those shown in SEQ ID NOS:2, 4, and 6,
including fusion polypeptides, variants and fragments thereof. The sequence of certain
particularly useful cDNAs which encode polypeptides are shown in SEQ ID NOS:1, 3 and
5. The present invention also includes cDNA as well as genomic DNA containing or
comprising the requisite nucleotide sequences as well as corresponding RNA and antisense
sequences.

Cloned DNA within the scope of the invention also includes allelic variants of the
specific sequences presented in the attached Sequence Listing. An "allelic variant" is a
sequence that is a variant from that of the exemplified nucleotide sequence, but represents
the same chromosomal locus in the organism. In addition to those which occur by normal
genetic variation in a population and perhaps fixed in the population by standard breeding
methods, allelic variants can be produced by genetic engineering methods. A preferred
allelic variant is one that is found in a naturally occurring organism, including a laboratory
strain. Allelic variants are either silent or expressed. A silent allele is one that does not
affect the phenotype of the organism. An expressed allele results in a detectable change in
the phenotype of the trait represented by the locus.

A nucleic acid sequence "encodes" or "codes for" a polypeptide if it directs the 
expression of the polypeptide referred to. The nucleic acid can be DNA or RNA. Unless 
otherwise specified, a nucleic acid sequence that encodes a polypeptide includes both the 
transcribed strand and the mRNA or the DNA representative of the mRNA. An "antisense" 
nucleic acid is one that is complementary to all or part of a strand representative of mRNA, 
including untranslated portions thereof. 

Degenerate Sequences 

In accordance with degeneracy of genetic code, it is possible to substitute at least
one base of the base sequence of a gene by another kind of base without causing the amino
acid sequence of the polypeptide produced from the gene to be changed. Hence, the DNA
of the present invention may also have any base sequence that has been changed by
substitution in accordance with degeneracy of genetic code.
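
As an illustration of degeneracy, the following minimal Python sketch translates two different DNA sequences with the standard genetic code and shows that they yield the same amino acid sequence. The function name translate and the two short example sequences are hypothetical and are not taken from the Sequence Listing.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# at each of the three positions ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence codon by codon, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Hypothetical example: the second sequence differs at two third-codon positions.
original = "ATGGCTAAA"
substituted = "ATGGCCAAG"

print(translate(original))
print(translate(substituted))
```

Both sequences translate to the same three residues (Met-Ala-Lys) because GCT and GCC both encode alanine, and AAA and AAG both encode lysine.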

DNA Modification 

The DNA is readily modified by substitution, deletion or insertion of nucleotides,
thereby resulting in novel DNA sequences encoding the polypeptide or its derivatives.
These modified sequences are used to produce mutant polypeptide and to directly express
the polypeptide. Methods for saturating a particular DNA sequence with random mutations
and also for making specific site directed mutations are known in the art; see e.g.
Sambrook et al. (1989).

Hybridizable Variants

The DNA molecules useful in accordance with the present invention can comprise a
nucleotide sequence selected from the group consisting of SEQ ID NOS:1, 3, 5 and 7-19,
or can comprise a nucleotide sequence that hybridizes to a DNA molecule comprising the
nucleotide sequence of SEQ ID NOS:1, 3 or 5 under salt and temperature conditions
providing stringency at least as high as that equivalent to 5x SSC and 42°C and that codes
on expression for a polypeptide that has one or more or all of the above-described physical
and/or biological properties. The present invention also includes polypeptides coded for by
these hybridizable variants. The relationship of stringency to hybridization and wash
conditions and other considerations of hybridization can be found in Chapters 11 and 12 of
Sambrook et al. (1989). The present invention also encompasses functional promoters
which hybridize to SEQ ID NOS:7, 8 or 9 under the above-described conditions. DNA
molecules of the invention will preferably hybridize to reference sequences under more
stringent conditions allowing the degree of mismatch represented by the degrees of
sequence identity enumerated above. The present invention also encompasses functional
primers or linker oligonucleotides set forth in SEQ ID NOS:10-19 or larger primers
comprising these sequences, or sequences which hybridize with these sequences under the
above-described conditions. The primers usually have a length of 10-50 nucleotides,
preferably 15-35 nucleotides, more preferably 18-30 nucleotides.

Vectors 

The invention is further directed to a replicable vector containing cDNA that codes
for the polypeptide and that is capable of expressing the polypeptide.

The present invention is also directed to a vector comprising a replicable vector and
a DNA sequence corresponding to the above described gene inserted into said vector. The
vector may be an integrating or non-integrating vector depending on its intended use and is
conveniently a plasmid.

Transformed Cells 

The invention further relates to a transformed cell or microorganism containing
cDNA or a vector which codes for the polypeptide or a fragment or variant thereof and that
is capable of expressing the polypeptide.

Expression Systems Using Vertebrate Cells 

Interest has been great in vertebrate cells, and propagation of vertebrate cells in
culture (tissue culture) has become a routine procedure. Examples of vertebrate host cell
lines useful in the present invention preferably include cells from any of the fish described
herein. Expression vectors for such cells ordinarily include (if necessary) an origin of
replication, a promoter located upstream from the gene to be expressed, along with a
ribosome binding site, RNA splice site (if intron-containing genomic DNA is used or if an
intron is necessary to optimize expression of a cDNA), a polyadenylation site, and a
transcription termination sequence.

EXAMPLES 

The following examples are provided by way of illustration only and not by way of
limitation. Those of skill will readily recognize a variety of noncritical parameters which
can be changed or modified to yield essentially similar results.

Example I: Isolation of skin-specific, muscle-specific and ubiquitously expressed
zebrafish cDNA clones.

cDNA clones were isolated and sequenced as described by Gong et al. (1997).
Basically, random cDNA clones were selected from zebrafish embryonic and adult cDNA
libraries and each clone was sequenced partially by a single sequencing reaction. The
partial sequences were then used to identify the sequenced clones for potential function and
tissue specificity. Out of 261 distinct clones identified by this approach, three were
selected for skin specificity (clone A39 encoding cytokeratin, CK), for muscle specificity
(clone E146 encoding muscle creatine kinase, MCK), and for ubiquitous expression (clone
A150 encoding acidic ribosomal protein P0, ARP), respectively.

The three cDNA clones were then sequenced completely and their complete cDNA 
sequences with deduced amino acid sequences are shown in SEQ ID NOS:1, 3 and 5, 
respectively. A39 encodes a type II basic cytokeratin and its closest homolog in mammals 
is cytokeratin 8 (65-68% amino acid identity). E146 codes for the zebrafish MCK and its 
amino acid sequence shares ~87% identity with mammalian MCKs. The amino acid 
sequence of zebrafish ARP deduced from the A150 clone is 87-89% identical to those of 
mammalian ARPs. 
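
The CDS coordinates and protein lengths given in the sequence listing can be cross-checked with simple arithmetic. The following Python sketch is illustrative only. The dictionary layout and the function name `protein_length_from_cds` are assumptions introduced here. The coordinates are copied from the <222> CDS entries of SEQ ID NOS:1, 3 and 5, and the stated lengths from the <211> entries of SEQ ID NOS:2 and 4. The sketch assumes that each CDS range includes its stop codon.

```python
# Illustrative consistency check. Names are hypothetical; coordinates are
# copied from the sequence listing (1-based, inclusive).
CDS_FEATURES = {
    "SEQ ID NO:1 (CK, clone A39)": (90, 1586),
    "SEQ ID NO:3 (MCK, clone E146)": (86, 1231),
    "SEQ ID NO:5 (ARP, clone A150)": (75, 1034),
}

# Protein lengths stated in the <211> entries of SEQ ID NOS:2 and 4.
STATED_PROTEIN_LENGTHS = {
    "SEQ ID NO:1 (CK, clone A39)": 498,
    "SEQ ID NO:3 (MCK, clone E146)": 381,
}


def protein_length_from_cds(start: int, end: int) -> int:
    """Return the residue count of a CDS whose range includes the stop codon."""
    n_nt = end - start + 1
    if n_nt % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return n_nt // 3 - 1  # the final codon is assumed to be the stop codon


for name, (start, end) in CDS_FEATURES.items():
    computed = protein_length_from_cds(start, end)
    stated = STATED_PROTEIN_LENGTHS.get(name)
    note = "" if stated is None else f" (stated: {stated})"
    print(f"{name}: {computed} residues{note}")
```

Under these assumptions the computed lengths for the CK and MCK clones agree with the stated protein lengths.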

To demonstrate their expression patterns, whole mount in situ hybridization was 
carried out for developing embryos and Northern blot analyses were carried out for 
selected adult tissues and for developing embryos. 

As indicated by whole mount in situ hybridization, cytokeratin mRNA was 
specifically expressed in the embryonic surface (Figs. 1A-1C), and cross sections of in situ 
hybridized embryos confirmed that the expression was only in skin epithelia (Fig. 1C). 
Ontogenetically, the cytokeratin mRNA appeared before 4 hpf and it is likely that the 
transcription of the cytokeratin gene starts at the mid-blastula transition, when the zygotic 
genome is activated. By in situ hybridization, a clear cytokeratin mRNA signal was 
detected in highly flattened cells of the superficial layer in the blastula and the expression 
remained in the superficial layer, which eventually developed into skin epithelia including 
the yolk sac. In adult tissues, cytokeratin mRNA was predominantly detected in the skin 
and also weakly in several other tissues including the eye, gill, intestine and muscle, but 
not in the liver and ovary (Fig. 2). Therefore, the cytokeratin mRNA is predominantly, if 
not specifically, expressed in skin cells. 

MCK mRNA was first detected in the first few anterior somites in 10-somite stage 
embryos (14 hpf), and at later stages the expression was specific to skeletal muscle (Fig. 
1D) and heart (data not shown). When the stained embryos were cross-sectioned, the 
MCK mRNA signal was found exclusively in the trunk skeletal muscles (Fig. 1E). In adult 
tissues, MCK mRNA was detected exclusively in the skeletal muscle (Fig. 2). 

ARP mRNA was expressed ubiquitously and it is presumably a maternal mRNA 
since it is present in the ovary as well as in embryos at the one-cell stage. In in situ 
hybridization experiments, an intense hybridization signal was detected in most tissues. An 
example of a hybridized embryo at 28 hpf is shown in Fig. 1F. In adults, ARP mRNA was 
abundantly expressed in all tissues examined except for the brain, where a relatively weak 
signal was detected (Fig. 2). These observations confirmed that the ARP mRNA is 
expressed ubiquitously. 

Example II: Isolation of zebrafish gene promoters 

Three zebrafish gene promoters were isolated by a linker-mediated PCR method as 
described by Liao et al. (1997) and as exemplified by the diagrams in Fig. 3. The whole 
procedure includes the following steps: 1) designing of gene specific primers; 2) isolation 
of zebrafish genomic DNA; 3) digestion of genomic DNA by a restriction enzyme; 4) 
ligation of a short linker DNA to the digested genomic DNA; 5) PCR amplification of the 
promoter region; and 6) DNA sequencing to confirm the cloned DNA fragment. The 
following is the detailed description of these steps. 

1. Designing of gene specific primers 

Gene specific PCR primers were designed based on the 5' end of the three cDNA 
sequences and the regions used for the primers are shown in Figs. 1-3. 

The two cytokeratin gene specific primers are: 

CK1 (SEQ ID NO:10) 

CK2 (SEQ ID NO:11), where the first six nucleotides are for creation of an EcoRI site to 
facilitate cloning. 

The two muscle creatine kinase gene specific primers are: 

MCK1 (SEQ ID NO:12), where the first five nucleotides are for creation of an EcoRI site 
to facilitate cloning. 

MCK2 (SEQ ID NO:13), where the first three nucleotides are for creation of an EcoRI site 
to facilitate cloning. 

The two acidic ribosomal protein P0 gene specific primers are: 

ARP1 (SEQ ID NO:14) 

ARP2 (SEQ ID NO:15), where the first six nucleotides are for creation of an EcoRI site to 
facilitate cloning. 
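
The general idea of an antisense gene specific primer carrying a 5' EcoRI tail can be sketched in Python as follows. This is an illustration only and does not reproduce SEQ ID NOS:10-15, whose sequences are not shown in this section. The function names are hypothetical, and appending a complete GAATTC site is an assumption. The cDNA string is nucleotides 1-92 of SEQ ID NO:1, and the region 66-85 is the primer_bind feature annotated as CK2 in the sequence listing.

```python
# Illustration only: the real primers are SEQ ID NOS:10-15.
# Nucleotides 1-92 of the cytokeratin cDNA (SEQ ID NO:1), ending with the ATG.
CDNA_5PRIME = (
    "ctctcctttgtgagcaacctcctccactcactcctctctcagagagcactctcgtacctc"
    "cttctcagcaactcaaagacacaggcatcatg"
)

ECORI_SITE = "gaattc"  # assumed full-length tail; EcoRI recognizes GAATTC

_COMPLEMENT = {"a": "t", "c": "g", "g": "c", "t": "a"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a lowercase DNA string."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq))


def antisense_primer(cdna: str, start: int, end: int, tail: str = "") -> str:
    """Build a 5'->3' antisense primer for cdna[start..end] (1-based, inclusive).

    The primer points upstream, toward the promoter, and may carry a 5' tail.
    """
    region = cdna[start - 1:end]
    return tail + reverse_complement(region)


# Region annotated as the CK2 binding site in SEQ ID NO:1 (nt 66-85).
primer = antisense_primer(CDNA_5PRIME, 66, 85, tail=ECORI_SITE)
print(primer)
```

Because the primer is antisense to the cDNA, extension proceeds from the 5' end of the transcript into the upstream genomic flank, which is the region the linker-mediated PCR is intended to recover.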

2. Isolation of zebrafish genomic DNA 

Genomic DNA was isolated from a single individual fish by a standard method 
(Sambrook et al., 1989). Generally, an adult fish was quickly frozen in liquid nitrogen and 
ground into powder. The ground tissue was then transferred to an extraction buffer (10 mM 
Tris, pH 8, 0.1 M EDTA, 20 µg/ml RNase A and 0.5% SDS) and incubated at 37°C for 1 
hour. Proteinase K was added to a final concentration of 100 µg/ml and gently mixed until 
the mixture appeared viscous, followed by incubation at 50°C for 3 hours with periodic 
swirling. The genomic DNA was gently extracted three times with phenol equilibrated with 
Tris-HCl (pH 8), precipitated by adding 0.1 volume of 3 M NaOAc and 2.5 volumes of 
ethanol, and collected by swirling on a glass rod, then rinsed in 70% ethanol. 

3. Digestion of genomic DNA by a restriction enzyme 

Genomic DNA was digested with the selected restriction enzymes. Generally, 500 
units of restriction enzyme were used to digest 50 µg of genomic DNA overnight at the 
optimal enzyme reaction temperature (usually 37°C). 

4. Ligation of a short linker DNA to the digested genomic DNA 

The linker DNA was assembled by annealing equal moles of the two linker 
oligonucleotides, Oligo 1 (SEQ ID NO:16) and Oligo 2 (SEQ ID NO:17). Oligo 2 was 
phosphorylated by T4 polynucleotide kinase prior to annealing. Restriction enzyme 
digested genomic DNA was filled-in or trimmed with T4 DNA polymerase, if necessary, 
and ligated with the linker DNA. Ligation was performed with 1 µg of digested genomic 
DNA and 0.5 µg of linker DNA in a 20 µl reaction containing 10 units of T4 DNA 
ligase at 4°C overnight. 

5. PCR amplification of promoter region 

PCR was performed with Advantage Tth Polymerase Mix (Clontech). The first 
round of PCR was performed using a linker specific primer L1 (SEQ ID NO:18) and a 
gene specific primer G1 (CK1, MCK1 or ARP1). Each reaction (50 µl) contained 5 µl of 
10x Tth PCR reaction buffer (1x = 15 mM KOAc, 40 mM Tris, pH 9.3), 2.2 µl of 25 mM 
Mg(OAc)2, 5 µl of 2 mM dNTP, 1 µl of L1 (0.2 µg/µl), 1 µl of G1 (0.2 µg/µl), 33.8 µl of 
H2O, 1 µl (50 ng) of linker-ligated genomic DNA and 1 µl of 50x Tth polymerase mix 
(Clontech). The cycling conditions were as follows: 94°C/1 min, 35 cycles of 94°C/30 sec 
and 68°C/6 min, and finally 68°C/8 min. After the primary round of PCR was completed, 
the products were diluted 100 fold. One µl of diluted PCR product was used as template 
for the second round of PCR (nested PCR) with a second linker specific primer L2 (SEQ 
ID NO:19) and a second gene specific primer G2 (CK2, MCK2 or ARP2), as described for 
the primary PCR but with the following modification: 94°C/1 min, 25 cycles of 94°C/30 
sec and 68°C/6 min, and finally 68°C/8 min. Both the primary and secondary PCR 
products were analyzed on a 1% agarose gel. 
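
The reaction composition and cycling programs described above can be checked with a short Python sketch. It is a bookkeeping aid under stated assumptions, not part of the described method. The function and variable names are hypothetical, the component volumes and stock concentrations are those listed in the text, and the run-time estimate counts programmed hold times only, ignoring temperature ramping.

```python
from math import isclose

# Component volumes (microlitres) of the primary reaction, as listed in the text.
PRIMARY_MIX_UL = {
    "10x Tth PCR reaction buffer": 5.0,
    "25 mM Mg(OAc)2": 2.2,
    "2 mM dNTP": 5.0,
    "primer L1 (0.2 ug/ul)": 1.0,
    "primer G1 (0.2 ug/ul)": 1.0,
    "H2O": 33.8,
    "linker-ligated genomic DNA (50 ng)": 1.0,
    "50x Tth polymerase mix": 1.0,
}


def final_concentration(stock: float, volume_ul: float, total_ul: float) -> float:
    """Dilute a stock into the total reaction volume (result in the stock's units)."""
    return stock * volume_ul / total_ul


def programmed_minutes(cycles: int, initial_min: float = 1.0,
                       denature_sec: float = 30.0,
                       anneal_extend_min: float = 6.0,
                       final_min: float = 8.0) -> float:
    """Sum the programmed hold times of a two-step cycling program."""
    per_cycle = denature_sec / 60.0 + anneal_extend_min
    return initial_min + cycles * per_cycle + final_min


total_ul = sum(PRIMARY_MIX_UL.values())
mg_mM = final_concentration(25.0, PRIMARY_MIX_UL["25 mM Mg(OAc)2"], total_ul)
dntp_mM = final_concentration(2.0, PRIMARY_MIX_UL["2 mM dNTP"], total_ul)
primary_min = programmed_minutes(cycles=35)  # first round
nested_min = programmed_minutes(cycles=25)   # nested round

print(f"total volume: {total_ul:.1f} ul")
print(f"Mg(OAc)2: {mg_mM:.2f} mM, dNTP: {dntp_mM:.2f} mM")
print(f"hold time: primary {primary_min:.1f} min, nested {nested_min:.1f} min")
```

The listed volumes add up to the stated 50 µl, and the 10x buffer and 50x polymerase mix are each diluted to 1x.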
6. DNA sequencing to confirm the cloned DNA fragment 



PCR products were purified from the agarose gel following electrophoresis and 
cloned into a TA vector, pT7Blue (Novagen). DNA sequencing was performed by the 
dideoxynucleotide chain termination method using a T7 Sequencing Kit purchased from 
Pharmacia. Complete sequences of these promoter regions were obtained by automatic 
sequencing using a dRhodamine Terminator Cycle Sequencing Ready Reaction Kit 
(Perkin-Elmer) and an ABI 377 automatic sequencing machine. 

The isolated cytokeratin gene promoter is 2.2 kb. In the 3' proximal region 
immediately upstream of a portion identical to the 3' part of the CK cDNA sequence, there 
is a putative TATA box perfectly matching a consensus TATA box sequence. The 164 
bp of the 3' region is identical to the 5' UTR (untranslated region) of the cytokeratin cDNA. 
Thus, the isolated fragment was indeed derived from the same gene as the cytokeratin 
cDNA clone (SEQ ID NO:7). Similarly, a 1.5 kb 5' flanking region was isolated from the 
muscle creatine kinase gene; a putative TATA box was also found in its 3' proximal region 
and the 3' region is identical to the 5' portion of the MCK cDNA clone (SEQ ID NO:8). A 
2.2 kb fragment was amplified for the ARP gene. By alignment of its sequence with the 
ARP cDNA clone, we found a 1.3 kb intron in the 5' UTR (SEQ ID NO:9). As a result, the 
isolated ARP promoter is only about 0.8 kb long. 

Example III: Generation of green fluorescent transgenic fish 

The isolated zebrafish gene promoters were inserted into the plasmid pEGFP-1, 
which contains an EGFP structural gene whose codons have been optimized according to 
preferred human codons. All three promoter fragments were inserted into pEGFP-1 
at the EcoRI and BamHI sites and the resulting recombinant plasmids were named 
pCK-EGFP (Fig. 4), pMCK-EGFP (Fig. 5), and pARP-EGFP (Fig. 6), respectively. 

Linearized plasmid DNAs at a concentration of 500 µg/ml (for pCK-EGFP and 
pMCK-EGFP) in 0.1 M Tris-HCl (pH 7.6)/0.25% phenol red were injected into the 
cytoplasm of 1- or 2-cell stage embryos. Because of a high mortality rate, pARP-EGFP 
was injected at a lower concentration (50 µg/ml). Each embryo received 300-500 pl of 
DNA. The injected embryos were reared in autoclaved Holtfreter's solution (0.35% NaCl, 
0.01% KCl and 0.01% CaCl2) supplemented with 1 µg/ml of methylene blue. Expression 
of GFP was observed and photographed under a ZEISS Axiovert 25 fluorescence 
microscope. 
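
The amount of DNA delivered per embryo follows from the stated concentrations and injection volumes. The short Python sketch below works through the unit conversion. The function name `dna_mass_pg` is an assumption introduced for illustration, and the values are those given in the paragraph above.

```python
from math import isclose


def dna_mass_pg(conc_ug_per_ml: float, volume_pl: float) -> float:
    """Mass of DNA (picograms) in an injected volume (picolitres).

    1 ug/ml equals 1e6 pg per 1e9 pl, that is 1e-3 pg/pl.
    """
    return conc_ug_per_ml * 1e-3 * volume_pl


# Concentrations (ug/ml) from the text; injected volume range is 300-500 pl.
for construct, conc in (("pCK-EGFP / pMCK-EGFP", 500.0), ("pARP-EGFP", 50.0)):
    low = dna_mass_pg(conc, 300.0)
    high = dna_mass_pg(conc, 500.0)
    print(f"{construct}: {low:.0f}-{high:.0f} pg DNA per embryo")
```

On these figures an embryo receives roughly 150-250 pg of pCK-EGFP or pMCK-EGFP DNA and roughly 15-25 pg of pARP-EGFP DNA.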
When zebrafish embryos received pCK-EGFP, GFP expression started about 4 
hours after injection, which corresponds to the stage of ~30% epiboly. About 55% of the 
injected embryos expressed GFP at this stage. The early expression was always in the 
superficial layer of cells, mimicking endogenous expression of the CK gene as observed by 
in situ hybridization. At later stages, in all GFP-expressing fish, GFP was found 
predominantly in skin epithelia. A typical GFP transgenic zebrafish fry at 4 days old is 
shown in Fig. 7. 

Under the MCK promoter, no GFP expression was observed in early embryos 
before muscle cells became differentiated. By 24 hpf, about 12% of surviving embryos 
expressed GFP strongly in muscle cells and these GFP-positive embryos remained 
GFP-positive after hatching. The GFP expression was always found in many bundles of muscle 
fibers, mainly in the mid-trunk region, and no expression was ever found in other types of 
cells. A typical GFP transgenic zebrafish fry (3 days old) is shown in Fig. 8. 

Expression of ARP-EGFP was first observed 4 hours after injection at the 30% 
epiboly stage. The timing of expression is similar to that of pCK-EGFP-injected embryos. 
However, unlike the CK-EGFP transgenic embryos, the GFP expression under the ARP 
promoter occurred not only in the superficial layer of cells but also in deep layers of cells. 
In some batches of injected embryos, almost 100% of the injected embryos expressed GFP 
initially. At later stages, when some embryonic cells became overtly differentiated, it was 
found that the GFP expression occurred in essentially all different types of cells, such as 
skin epithelia, muscle cells, lens, neural tissues, notochord, circulating blood cells and yolk 
cells (Fig. 9). 

Example IV: Potential applications of fluorescent transgenic fish 

The fluorescent transgenic fish have use as ornamental fish in the market. Stably 
transgenic lines can be developed by breeding a GFP transgenic individual with a wild type 
fish. By isolation of more zebrafish gene promoters, such as eye specific, bone specific, tail 
specific etc., and/or by classical breeding of these transgenic zebrafish, more varieties of 
fluorescent transgenic zebrafish can be produced. Previously, we have reported isolation of 
over 200 distinct zebrafish cDNA clones homologous to known genes (Gong et al., 1997). 
These isolated clones code for proteins in a variety of tissues and some of them are 
inducible by heat-shock, heavy metals, or hormones such as estrogens; thus, this work 
provided rich resources to isolate tissue-specific and inducible promoters according to the 
method described in the present invention. 
Multiple color fluorescent fish may be generated by the same technique, as blue 
fluorescent protein (BFP) gene, yellow fluorescent protein (YFP) gene and cyan 
fluorescent protein (CFP) gene are available from Clontech. For example, a transgenic 
fish with GFP under an eye specific promoter, BFP under a skin specific promoter, and 
YFP under a muscle specific promoter will show the following multiple fluorescent colors: 
green eyes, blue skin and yellow muscle. By recombining different tissue specific 
promoters and fluorescent protein genes, more varieties of transgenic fish of different 
fluorescent color patterns will be created. By expression of two or more different 
fluorescent proteins in the same tissue, an intermediate color may be created. For example, 
by expression of both GFP and BFP under a skin specific promoter, a dark-green skin color 
may be created. 
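
The number of color patterns obtainable by recombining promoters and fluorescent protein genes can be enumerated with a small Python sketch. It is purely combinatorial and makes no prediction about actual expression or perceived color. The tissue list uses the three tissues named in the example above, and the function name `color_patterns` is hypothetical.

```python
from itertools import product

# Fluorescent proteins named in the text and their nominal colors.
FLUORESCENT_PROTEINS = {"GFP": "green", "BFP": "blue", "YFP": "yellow", "CFP": "cyan"}

# Tissues from the example in the text; each stands for a tissue specific promoter.
TISSUES = ("eye", "skin", "muscle")


def color_patterns(tissues=TISSUES, proteins=FLUORESCENT_PROTEINS):
    """Yield one dict per pattern, mapping each tissue to one fluorescent protein."""
    for combo in product(proteins, repeat=len(tissues)):
        yield dict(zip(tissues, combo))


patterns = list(color_patterns())
example = {"eye": "GFP", "skin": "BFP", "muscle": "YFP"}  # green eyes, blue skin, yellow muscle

print(len(patterns), "single-protein-per-tissue patterns")
print("example from the text included:", example in patterns)
```

With four proteins and three tissue specific promoters, one protein per tissue already gives 4 x 4 x 4 = 64 patterns, before considering co-expression of two proteins in the same tissue.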

By using a heavy metal-inducible or hormone- (such as estrogen or other steroid 
hormone) inducible promoter, a biosensor system may be developed for monitoring 
environmental pollution. In such a biosensor system, the transgenic fish will glow with a 
green fluorescence (or other color depending on the fluorescent protein gene used) when 
pollutants such as heavy metals and estrogens (or their derivatives) reach a threshold 
concentration in an aquatic environment. Such a biosensor system has obvious advantages 
over classical analytical methods because it is rapid, visualizable, and capable of 
identifying specific compounds directly in the complex mixtures found in an aquatic 
environment, and is portable or less instrument dependent. Moreover, the biosensor system 
also provides direct information on biotoxicity and it is biodegradable and regenerative. 

In addition, the fluorescent transgenic fish should also be valuable in the market for 
scientific research tools because they can be used for embryonic studies such as tracing cell 
lineage and cell migration. Cells from transgenic fish expressing GFP can also be used as 
cellular and genetic markers in cell transplantation and nuclear transplantation 
experiments. 

The chimeric gene constructs demonstrated successfully in zebrafish in the present 
invention should also be applicable to other fish species such as medaka, goldfish, carp 
including koi, loach, tilapia, glassfish, catfish, angel fish, discus, eel, tetra, goby, gourami, 
guppy, Xiphophorus (swordtail), hatchet fish, Molly fish, pangasius, etc. The promoters 
described herein can be used directly in these fish species. Alternatively, the homologous 
gene promoters can be isolated by the method described in this invention. For example, the 
isolated and characterized zebrafish cDNA clones and promoters described in this 
invention can be used as molecular probes to screen for homologous promoters in other 
fish species by molecular hybridization or by PCR. Alternatively, one can first isolate the 
zebrafish cDNA and promoters based on the sequences presented in SEQ ID NOS:1, 3, 5 
and 7-9 by PCR and then use the zebrafish gene fragments to obtain homologous genes 
from other fish species by the methods mentioned above. 
SEQUENCE LISTING 

<110> GONG, Zhiyuan 
LAM, Toong Jin 
JU, Bensheng 
XU, Yanfei 
HE, Jiangyan 
YAN, Tie 

<120> CHIMERIC GENE CONSTRUCTS FOR GENERATION OF FLUORESCENT 
TRANSGENIC ORNAMENTAL FISH 

<130> ARC- PAP135 

<140> 
<141> 

<160> 19 

<170> PatentIn Ver. 2.0 

<210> 1 

<211> 2480 

<212> DNA 

<213> Danio rerio 

<220> 

<221> CDS 

<222> (90)..(1586) 

<220> 

<221> primer_bind 
<222> (66)..(85) 
<223> CK2 

<220> 

<221> primer_bind 
<222> (97)..(120) 
<223> CK1 

<220> 

<221> polyA_signal 
<222> (2446)..(2451) 

<400> 1 

ctctcctttg tgagcaacct cctccactca ctcctctctc agagagcact ctcgtacctc 60 

cttctcagca actcaaagac acaggcatc atg tca acc agg tct atc tct tac 113 

Met Ser Thr Arg Ser Ile Ser Tyr 
1 5 

tcc agc ggt ggc tcc atc agg agg ggc tac acc agc cag tca gcc tat 161 

Ser Ser Gly Gly Ser Ile Arg Arg Gly Tyr Thr Ser Gln Ser Ala Tyr 
10 15 20 

gca gta cct gcc ggc tct acc agg atg agc tca gtg acc agt gtc agg 209 

Ala Val Pro Ala Gly Ser Thr Arg Met Ser Ser Val Thr Ser Val Arg 
25 30 35 40 

aga tct ggt gtg ggt gcc agc cca ggc ttc ggt gcc ggt ggc agc tac 257 

Arg Ser Gly Val Gly Ala Ser Pro Gly Phe Gly Ala Gly Gly Ser Tyr 
45 50 55 

agc ttt agc agc agc agc atg ggt gga ggc tat gga agt ggt ctt ggt 305 

Ser Phe Ser Ser Ser Ser Met Gly Gly Gly Tyr Gly Ser Gly Leu Gly 
60 65 70 

gga ggt ctc ggt ggg ggc atg ggc ttt cgt tgc ggg ctt cct atc aca 353 

Gly Gly Leu Gly Gly Gly Met Gly Phe Arg Cys Gly Leu Pro Ile Thr 
75 80 85 

gct gta act gtc aac cag aac ctg ttg gcc ccc tta aac ctg gaa atc 401 

Ala Val Thr Val Asn Gln Asn Leu Leu Ala Pro Leu Asn Leu Glu Ile 
90 95 100 

gac ccc aca att caa gct gtc cgc act tca gag aaa gag cag att aag 449 

Asp Pro Thr Ile Gln Ala Val Arg Thr Ser Glu Lys Glu Gln Ile Lys 
105 110 115 120 

acc ttc aac aac cgc ttc gct ttc ctc atc gac aaa gtg cgc ttc ctg 497 

Thr Phe Asn Asn Arg Phe Ala Phe Leu Ile Asp Lys Val Arg Phe Leu 
125 130 135 

gaa cag cag aac aag atg ctt gag acc aaa tgg agt ctt ctc caa gaa 545 

Glu Gln Gln Asn Lys Met Leu Glu Thr Lys Trp Ser Leu Leu Gln Glu 
140 145 150 

cag aca acc aca cgt tcc aac atc gat gcc atg ttt gag gca tac atc 593 

Gln Thr Thr Thr Arg Ser Asn Ile Asp Ala Met Phe Glu Ala Tyr Ile 
155 160 165 

tct aac ctg cgc aga cag ctc gat gga ctg gga aat gag aag atg aag 641 

Ser Asn Leu Arg Arg Gln Leu Asp Gly Leu Gly Asn Glu Lys Met Lys 
170 175 180 

ctg gag gga gag ctg aag aac atg cag ggc ctg gtt gag gac ttc aag 689 

Leu Glu Gly Glu Leu Lys Asn Met Gln Gly Leu Val Glu Asp Phe Lys 
185 190 195 200 

aac aag tac gag gat gag atc aac aag cgt gct tcc gta gag aat gag 737 

Asn Lys Tyr Glu Asp Glu Ile Asn Lys Arg Ala Ser Val Glu Asn Glu 
205 210 215 

ttt gtc ctg ctc aag aag gat gtt gat gca gcc tac atg aac aag gtt 785 

Phe Val Leu Leu Lys Lys Asp Val Asp Ala Ala Tyr Met Asn Lys Val 
220 225 230 
gag ctt gaa gcc aag gtt gat gct ctt cag gat gag atc aac ttc ctc 833 

Glu Leu Glu Ala Lys Val Asp Ala Leu Gln Asp Glu Ile Asn Phe Leu 
235 240 245 

agg gca gtc tac gag gct gaa ctc cgg gag ctc cag tct cag atc aag 881 

Arg Ala Val Tyr Glu Ala Glu Leu Arg Glu Leu Gln Ser Gln Ile Lys 
250 255 260 

gac aca tct gtt gtt gta gaa atg gac aac agc aga aac ctg gat atg 929 

Asp Thr Ser Val Val Val Glu Met Asp Asn Ser Arg Asn Leu Asp Met 
265 270 275 280 

gac tcc atc gtg gct gaa gtt cgc gct cag tat gaa gac atc gcc aac 977 

Asp Ser Ile Val Ala Glu Val Arg Ala Gln Tyr Glu Asp Ile Ala Asn 
285 290 295 

cgc agc cgt gcc gag gca gag agc tgg tac aaa cag aag ttt gag gag 1025 

Arg Ser Arg Ala Glu Ala Glu Ser Trp Tyr Lys Gln Lys Phe Glu Glu 
300 305 310 

atg cag agc acc gct ggt cag tat ggt gat gac ctc cgc tca aca aag 1073 

Met Gln Ser Thr Ala Gly Gln Tyr Gly Asp Asp Leu Arg Ser Thr Lys 
315 320 325 

gct gag att gct gaa ctc aac cgc atg atc gcc cgc ctg cag aac gag 1121 

Ala Glu Ile Ala Glu Leu Asn Arg Met Ile Ala Arg Leu Gln Asn Glu 
330 335 340 

atc gat gct gtc aag gca cag cgt gcc aac ttg gag gct cag att gct 1169 

Ile Asp Ala Val Lys Ala Gln Arg Ala Asn Leu Glu Ala Gln Ile Ala 
345 350 355 360 

gag gct gaa gag cgt gga gag ctg gca gtg aag gat gcc aag ctc cgc 1217 

Glu Ala Glu Glu Arg Gly Glu Leu Ala Val Lys Asp Ala Lys Leu Arg 
365 370 375 

atc agg gag ctg gag gaa gct ctt cag agg gcc aag caa gac atg gcc 1265 

Ile Arg Glu Leu Glu Glu Ala Leu Gln Arg Ala Lys Gln Asp Met Ala 
380 385 390 

cgc cag gtc cgc gag tac cag gag ctc atg aac gtc aaa ttg gct ctg 1313 

Arg Gln Val Arg Glu Tyr Gln Glu Leu Met Asn Val Lys Leu Ala Leu 
395 400 405 

gac att gag atc gcc acc tac agg aaa ctg ttg gaa gga gag gag agc 1361 

Asp Ile Glu Ile Ala Thr Tyr Arg Lys Leu Leu Glu Gly Glu Glu Ser 
410 415 420 

aga ctg tcc agc ggt gga gct caa gct acc att cat gtt cag cag acc 1409 

Arg Leu Ser Ser Gly Gly Ala Gln Ala Thr Ile His Val Gln Gln Thr 
425 430 435 440 

tcc gga ggt gtt tca tct ggt tat ggt ggt agc ggc tct ggt ttc ggc 1457 

Ser Gly Gly Val Ser Ser Gly Tyr Gly Gly Ser Gly Ser Gly Phe Gly 
445 450 455 

tac agc agt ggc ttc agc agt ggt ggg tca gga tac ggt agt gga tca 1505 

Tyr Ser Ser Gly Phe Ser Ser Gly Gly Ser Gly Tyr Gly Ser Gly Ser 
460 465 470 

gga ttc ggt tct gga tca ggg tat ggt gga ggc tcc atc agc aaa acc 1553 

Gly Phe Gly Ser Gly Ser Gly Tyr Gly Gly Gly Ser Ile Ser Lys Thr 
475 480 485 

agt gtc acc acc gtc agc agt aaa cgc tat taa ggagaagccc gcccaaaccc 1606 

Ser Val Thr Thr Val Ser Ser Lys Arg Tyr 
490 495 



ccagccgaca cagtttccaa ccttccttac ctgcaactag atcccttctg aaccttctta 1666 

cgactcaaac catctatggt gctatatttt agccagacag ctgtcccctg ttaatgagga 1726 

gatgtggacg atgattttta aagtacaaaa taagttttag attgttctgt gtgttgatgg 1786 

tagttacccg tatcatgcat ctcctgtctg gtggtgtcac tgccatttta aatcatcaac 1846 

ccaactacac taaaacgata ccaggaagaa tcgtgctcca agccactgaa tagtcttatt 1906 

tctgcactga tatgtacagg gaaagtgaga cacatagaaa ccactgtaac ctacgtagta 1966 

ctatggtttc actggatcag gggtgtgcta tacaagttcc tgaatgtctt gtttgaatgt 2026 

tttgtgctgt tacaagctcc ctgctgtagt tttgctgact aatctgactt ttgtcatttt 2086 

gctatggctg tcagagttgg tttacctatt ttctataaaa tgtatatggc agtcagccaa 2146 

taactgatga caattgcttg tgggctacta atgtccagtt acctcacatt caagggagat 2206 

ctgttacagc aaaaaacagg cacaatggga tttatgtgga ccatccctcc ttaaccttgt 2266 

gtactttccg tgttggaagt ggtgactgta ctgccttaca cattcccctg tattcaactg 2326 

gcttccagag catattttac atccccggtt ataaatggaa aatgcaagaa aactgaaaca 2386 

atgttcaacc agatttattt ggtattgatt gacgagacac caacttgaaa tttgaataca 2446 

ataaatctga gaccacaaaa aaaaaaaaaa aaaa 2480 



<210> 2 
<211> 498 
<212> PRT 
<213> Danio rerio 

<400> 2 

Met Ser Thr Arg Ser Ile Ser Tyr Ser Ser Gly Gly Ser Ile Arg Arg 
1 5 10 15 

Gly Tyr Thr Ser Gln Ser Ala Tyr Ala Val Pro Ala Gly Ser Thr Arg 
20 25 30 

Met Ser Ser Val Thr Ser Val Arg Arg Ser Gly Val Gly Ala Ser Pro 
35 40 45 

Gly Phe Gly Ala Gly Gly Ser Tyr Ser Phe Ser Ser Ser Ser Met Gly 
50 55 60 

Gly Gly Tyr Gly Ser Gly Leu Gly Gly Gly Leu Gly Gly Gly Met Gly 
65 70 75 80 

Phe Arg Cys Gly Leu Pro Ile Thr Ala Val Thr Val Asn Gln Asn Leu 
85 90 95 

Leu Ala Pro Leu Asn Leu Glu Ile Asp Pro Thr Ile Gln Ala Val Arg 
100 105 110 

Thr Ser Glu Lys Glu Gln Ile Lys Thr Phe Asn Asn Arg Phe Ala Phe 
115 120 125 

Leu Ile Asp Lys Val Arg Phe Leu Glu Gln Gln Asn Lys Met Leu Glu 
130 135 140 

Thr Lys Trp Ser Leu Leu Gln Glu Gln Thr Thr Thr Arg Ser Asn Ile 
145 150 155 160 

Asp Ala Met Phe Glu Ala Tyr Ile Ser Asn Leu Arg Arg Gln Leu Asp 
165 170 175 

Gly Leu Gly Asn Glu Lys Met Lys Leu Glu Gly Glu Leu Lys Asn Met 
180 185 190 

Gln Gly Leu Val Glu Asp Phe Lys Asn Lys Tyr Glu Asp Glu Ile Asn 
195 200 205 

Lys Arg Ala Ser Val Glu Asn Glu Phe Val Leu Leu Lys Lys Asp Val 
210 215 220 

Asp Ala Ala Tyr Met Asn Lys Val Glu Leu Glu Ala Lys Val Asp Ala 
225 230 235 240 

Leu Gln Asp Glu Ile Asn Phe Leu Arg Ala Val Tyr Glu Ala Glu Leu 
245 250 255 

Arg Glu Leu Gln Ser Gln Ile Lys Asp Thr Ser Val Val Val Glu Met 
260 265 270 

Asp Asn Ser Arg Asn Leu Asp Met Asp Ser Ile Val Ala Glu Val Arg 
275 280 285 

Ala Gln Tyr Glu Asp Ile Ala Asn Arg Ser Arg Ala Glu Ala Glu Ser 
290 295 300 

Trp Tyr Lys Gln Lys Phe Glu Glu Met Gln Ser Thr Ala Gly Gln Tyr 
305 310 315 320 

Gly Asp Asp Leu Arg Ser Thr Lys Ala Glu Ile Ala Glu Leu Asn Arg 
325 330 335 

Met Ile Ala Arg Leu Gln Asn Glu Ile Asp Ala Val Lys Ala Gln Arg 
340 345 350 

Ala Asn Leu Glu Ala Gln Ile Ala Glu Ala Glu Glu Arg Gly Glu Leu 
355 360 365 

Ala Val Lys Asp Ala Lys Leu Arg Ile Arg Glu Leu Glu Glu Ala Leu 
370 375 380 

Gln Arg Ala Lys Gln Asp Met Ala Arg Gln Val Arg Glu Tyr Gln Glu 
385 390 395 400 

Leu Met Asn Val Lys Leu Ala Leu Asp Ile Glu Ile Ala Thr Tyr Arg 
405 410 415 

Lys Leu Leu Glu Gly Glu Glu Ser Arg Leu Ser Ser Gly Gly Ala Gln 
420 425 430 

Ala Thr Ile His Val Gln Gln Thr Ser Gly Gly Val Ser Ser Gly Tyr 
435 440 445 

Gly Gly Ser Gly Ser Gly Phe Gly Tyr Ser Ser Gly Phe Ser Ser Gly 
450 455 460 

Gly Ser Gly Tyr Gly Ser Gly Ser Gly Phe Gly Ser Gly Ser Gly Tyr 
465 470 475 480 

Gly Gly Gly Ser Ile Ser Lys Thr Ser Val Thr Thr Val Ser Ser Lys 
485 490 495 

Arg Tyr 



<210> 3 

<211> 1589 

<212> DNA 

<213> Danio rerio 



<220> 

<221> CDS 

<222> (86)..(1231) 

<220> 

<221> primer_bind 
<222> (6)..(26) 
<223> MCK2 

<220> 

<221> primer_bind 
<222> (20)..(38) 
<223> MCK1 

<220> 

<221> polyA_signal 
<222> (1534)..(1539) 

<400> 3 

cctatttcgg cttggtgaac aggatctgat cccaaggact gttaccactt ttgttgtctt 60 

ttgtgcagtg ttagaaaccg caatc atg cct ttc gga aac acc cac aac aac 112 

Met Pro Phe Gly Asn Thr His Asn Asn 
1 5 



ttc aag ctg aac tac tca gtt gat gag gag tat cca gac ctt agc aag 160 

Phe Lys Leu Asn Tyr Ser Val Asp Glu Glu Tyr Pro Asp Leu Ser Lys 
10 15 20 25 

cac aac aac cac atg gcc aag gtg ctg act aag gaa atg tat ggc aag 208 

His Asn Asn His Met Ala Lys Val Leu Thr Lys Glu Met Tyr Gly Lys 
30 35 40 

ctt agg gac aag cag acc cca cct gga ttc act gtg gat gat gtc atc 256 

Leu Arg Asp Lys Gln Thr Pro Pro Gly Phe Thr Val Asp Asp Val Ile 
45 50 55 

cag act ggt gtt gac aat cca ggc cac ccc ttc atc atg acc gtc ggc 304 

Gln Thr Gly Val Asp Asn Pro Gly His Pro Phe Ile Met Thr Val Gly 
60 65 70 

tgt gtt gct ggt gat gag gag tcc tac gat gtt ttc aag gac ctg ttc 352 

Cys Val Ala Gly Asp Glu Glu Ser Tyr Asp Val Phe Lys Asp Leu Phe 
75 80 85 

gac ccc gtc att tcc gac cgt cac ggt gga tac aag gca act gac aag 400 

Asp Pro Val Ile Ser Asp Arg His Gly Gly Tyr Lys Ala Thr Asp Lys 
90 95 100 105 

cac aag acc gac ctc aac ttt gag aac ctg aag ggt ggt gat gac ctg 448 

His Lys Thr Asp Leu Asn Phe Glu Asn Leu Lys Gly Gly Asp Asp Leu 
110 115 120 

gac ccc aac tac ttc ctg agc agc cgt gtg cgt acc gga cgc agc atc 496 

Asp Pro Asn Tyr Phe Leu Ser Ser Arg Val Arg Thr Gly Arg Ser Ile 
125 130 135 

aag gga tac ccc ctg ccc ccc cac aac agc cgt gga gag cgc aga gct 544 

Lys Gly Tyr Pro Leu Pro Pro His Asn Ser Arg Gly Glu Arg Arg Ala 
140 145 150 

gtg gag aag ctg tct gtt gaa gct ctg agt agc ttg gat gga gag ttc 592 

Val Glu Lys Leu Ser Val Glu Ala Leu Ser Ser Leu Asp Gly Glu Phe 
155 160 165 

aag ggc aag tac tac ccc ctg aag tcc atg act gat gac gag cag gag 640 

Lys Gly Lys Tyr Tyr Pro Leu Lys Ser Met Thr Asp Asp Glu Gln Glu 
170 175 180 185 

cag ctg atc gct gac cac ttc ctc ttt gac aaa ccc gtc tcc ccc ctg 688 

Gln Leu Ile Ala Asp His Phe Leu Phe Asp Lys Pro Val Ser Pro Leu 
190 195 200 

ctg ctg gct gct ggt atg gcc cgt gac tgg ccc gat gcc aga ggc att 736 

Leu Leu Ala Ala Gly Met Ala Arg Asp Trp Pro Asp Ala Arg Gly Ile 
205 210 215 

tgg cac aat gag aac aaa gcc ttc ctg gtc tgg gtg aaa cag gag gat 784 

Trp His Asn Glu Asn Lys Ala Phe Leu Val Trp Val Lys Gln Glu Asp 
220 225 230 

cac ctg cgt gtc att tcc atg cag aag ggt ggc aac atg aag gaa gtg 832 

His Leu Arg Val Ile Ser Met Gln Lys Gly Gly Asn Met Lys Glu Val 
235 240 245 

ttc aag cgc ttc tgc gtt ggt ctt cag agg att gag gaa att ttc aag 880 

Phe Lys Arg Phe Cys Val Gly Leu Gln Arg Ile Glu Glu Ile Phe Lys 
250 255 260 265 

aag cac aac cat ggg ttc atg tgg aac gag cat ctt ggt ttc gtc ctg 928 

Lys His Asn His Gly Phe Met Trp Asn Glu His Leu Gly Phe Val Leu 
270 275 280 

acc tgc ccc tcc aac ctg ggc aca ggc ctg cgc ggt gga gtc cac gtc 976 

Thr Cys Pro Ser Asn Leu Gly Thr Gly Leu Arg Gly Gly Val His Val 
285 290 295 

aag ctg ccc aag ctc agc aca cat gcc aag ttt gag gag atc ctg acc 1024 

Lys Leu Pro Lys Leu Ser Thr His Ala Lys Phe Glu Glu Ile Leu Thr 
300 305 310 

aga ctg cgc ctg cag aag cgt ggc aca ggg ggt gtg gac acc gct tcc 1072 

Arg Leu Arg Leu Gln Lys Arg Gly Thr Gly Gly Val Asp Thr Ala Ser 
315 320 325 

gtt ggt gga gtg ttt gac att tcc aac gct gac cgt atc ggc tct tca 1120 

Val Gly Gly Val Phe Asp Ile Ser Asn Ala Asp Arg Ile Gly Ser Ser 
330 335 340 345 

gag gtt gag cag gtg cag tgt gtg gtt gat ggt gtc aag ctg atg gtg 1168 

Glu Val Glu Gln Val Gln Cys Val Val Asp Gly Val Lys Leu Met Val 
350 355 360 

gag atg gag aag aag ctg gga gaa ggc cag tcc atc gac agc atg atc 1216 

Glu Met Glu Lys Lys Leu Gly Glu Gly Gln Ser Ile Asp Ser Met Ile 
365 370 375 

cct gcc cag aag taa agcgggaggc ccttccattt ttttcttcgt ctttgtctgt 1271 

Pro Ala Gln Lys 
380 

ttttttacag tccaacagca acgsagagga aaactgctgc tcaaaaagac agtctcacct 1331 

ttgcacctgt cttctttcct ttttttccct tcttctctaa tttccatgtc atttcgccat 1391 

ctttttttcc actttgtttc ctattaagtc ggtaacatct tgggatcaga tacccggsgc 1451 

aggagtgagt gcttgttgct gaggcttcac ctcaatttca gccttggttg taaaaagtga 1511 

atcaatcaaa gttgtatttc aaaataaaaa tccccaataa aaaaaaaaaa aaaaaaaaaa 1571 

aaaaaaaaaa aaaaaaaa 1589 

<210> 4 

<211> 381 

<212> PRT 

<213> Danio rerio 

<400> 4 

Met Pro Phe Gly Asn Thr His Asn Asn Phe Lys Leu Asn Tyr Ser Val 
1 5 10 15 

Asp Glu Glu Tyr Pro Asp Leu Ser Lys His Asn Asn His Met Ala Lys 
20 25 30 

Val Leu Thr Lys Glu Met Tyr Gly Lys Leu Arg Asp Lys Gln Thr Pro 
35 40 45 

Pro Gly Phe Thr Val Asp Asp Val Ile Gln Thr Gly Val Asp Asn Pro 
50 55 60 

Gly His Pro Phe Ile Met Thr Val Gly Cys Val Ala Gly Asp Glu Glu 
65 70 75 80 

Ser Tyr Asp Val Phe Lys Asp Leu Phe Asp Pro Val Ile Ser Asp Arg 
85 90 95 

His Gly Gly Tyr Lys Ala Thr Asp Lys His Lys Thr Asp Leu Asn Phe 
100 105 110 

Glu Asn Leu Lys Gly Gly Asp Asp Leu Asp Pro Asn Tyr Phe Leu Ser 
115 120 125 

Ser Arg Val Arg Thr Gly Arg Ser Ile Lys Gly Tyr Pro Leu Pro Pro 
130 135 140 

His Asn Ser Arg Gly Glu Arg Arg Ala Val Glu Lys Leu Ser Val Glu 
145 150 155 160 

Ala Leu Ser Ser Leu Asp Gly Glu Phe Lys Gly Lys Tyr Tyr Pro Leu 
165 170 175 

Lys Ser Met Thr Asp Asp Glu Gln Glu Gln Leu Ile Ala Asp His Phe 
180 185 190 

Leu Phe Asp Lys Pro Val Ser Pro Leu Leu Leu Ala Ala Gly Met Ala 
195 200 205 

Arg Asp Trp Pro Asp Ala Arg Gly Ile Trp His Asn Glu Asn Lys Ala 
210 215 220 

Phe Leu Val Trp Val Lys Gln Glu Asp His Leu Arg Val Ile Ser Met 
225 230 235 240 

Gln Lys Gly Gly Asn Met Lys Glu Val Phe Lys Arg Phe Cys Val Gly 
245 250 255 

Leu Gln Arg Ile Glu Glu Ile Phe Lys Lys His Asn His Gly Phe Met 
260 265 270 

Trp Asn Glu His Leu Gly Phe Val Leu Thr Cys Pro Ser Asn Leu Gly 
275 280 285 

Thr Gly Leu Arg Gly Gly Val His Val Lys Leu Pro Lys Leu Ser Thr 
290 295 300 

His Ala Lys Phe Glu Glu Ile Leu Thr Arg Leu Arg Leu Gln Lys Arg 
305 310 315 320 

Gly Thr Gly Gly Val Asp Thr Ala Ser Val Gly Gly Val Phe Asp Ile 
325 330 335 

Ser Asn Ala Asp Arg Ile Gly Ser Ser Glu Val Glu Gln Val Gln Cys 
340 345 350 

Val Val Asp Gly Val Lys Leu Met Val Glu Met Glu Lys Lys Leu Gly 
355 360 365 

Glu Gly Gln Ser Ile Asp Ser Met Ile Pro Ala Gln Lys 
370 375 380 



<210> 5 

<211> 1104 

<212> DNA 

<213> Danio rerio 

<220> 

<221> CDS 

<222> (75) . . (1034) 

<220> 

<221> primer_bind 
<222> (45) . . (64) 
<223> ARP2 

<220> 

<221> primer_bind 
<222> (87) . . (112) 
<223> ARP1 

<220> 

<221> polyA_signal 
<222> (1069) . . (1074) 

<400> 5 

cgcgtcccta ccgtgagatt ttacaacctt gtctttaaac cggctgttca ccgatccttg 60 

gaagcactgc aaag atg ccc agg gaa gac agg gcc acg tgg aag tcc aac 110 
Met Pro Arg Glu Asp Arg Ala Thr Trp Lys Ser Asn 
1 5 10 

tat ttt ctg aaa atc atc caa ctg ctg gat gac ttc ccc aag tgt ttc 158 
Tyr Phe Leu Lys Ile Ile Gln Leu Leu Asp Asp Phe Pro Lys Cys Phe 
15 20 25 

atc gtg ggc gca gac aat gtc ggc tcc aag cag atg cag acc atc cgt 206 
Ile Val Gly Ala Asp Asn Val Gly Ser Lys Gln Met Gln Thr Ile Arg 
30 35 40 

ctg tcc ctg cgg ggc aag gcc gtc gtg ctc atg ggg aaa aac acc atg 254 
Leu Ser Leu Arg Gly Lys Ala Val Val Leu Met Gly Lys Asn Thr Met 
45 50 55 60 

atg agg aag gcc att cgt ggc cac ctg gaa aac aac cca gct ctg gag 302 
Met Arg Lys Ala Ile Arg Gly His Leu Glu Asn Asn Pro Ala Leu Glu 
65 70 75 

agg ctg ctt ccc cac atc cgc ggg aac gtg ggc ttc gtc ttc acc aag 350 
Arg Leu Leu Pro His Ile Arg Gly Asn Val Gly Phe Val Phe Thr Lys 
80 85 90 

gag gat ctg act gag gtc cga gac ctg ctg ctg gca aac aaa gtg ccc 398 
Glu Asp Leu Thr Glu Val Arg Asp Leu Leu Leu Ala Asn Lys Val Pro 
95 100 105 

gct gct gcc cgt gct ggt gcc atc gcc ccc tgt gag gtg act gtg ccg 446 
Ala Ala Ala Arg Ala Gly Ala Ile Ala Pro Cys Glu Val Thr Val Pro 
110 115 120 

gcc cag aac acc ggg ctc ggt cct gag aag acc tct ttc ttc cag gct 494 
Ala Gln Asn Thr Gly Leu Gly Pro Glu Lys Thr Ser Phe Phe Gln Ala 
125 130 135 140 

ttg gga atc acc acc aag atc tcc aga gga acc att gaa atc ttg agt 542 
Leu Gly Ile Thr Thr Lys Ile Ser Arg Gly Thr Ile Glu Ile Leu Ser 
145 150 155 

gac gtt cag ctt atc aaa cct gga gac aag gtg ggc gcc agc gag gcc 590 
Asp Val Gln Leu Ile Lys Pro Gly Asp Lys Val Gly Ala Ser Glu Ala 
160 165 170 

acg ctg ctg aac atg ctg aac atg ctg aac atc tcg ccc ttc tcc tac 638 
Thr Leu Leu Asn Met Leu Asn Met Leu Asn Ile Ser Pro Phe Ser Tyr 
175 180 185 

ggg ctg atc atc cag cag gtg tat gat aac ggc agt gtc tac agc ccc 686 
Gly Leu Ile Ile Gln Gln Val Tyr Asp Asn Gly Ser Val Tyr Ser Pro 
190 195 200 

gag gtg ctg gac atc act gag gac gcc ctg cac aag agg ttc ctg aag 734 
Glu Val Leu Asp Ile Thr Glu Asp Ala Leu His Lys Arg Phe Leu Lys 
205 210 215 220 

ggt gtg agg aac atc gcc agt gtg tgt ctg cag atc ggc tac cca act 782 
Gly Val Arg Asn Ile Ala Ser Val Cys Leu Gln Ile Gly Tyr Pro Thr 
225 230 235 

ctt gct tcc atc cct cac act atc atc aat gga tac aag age gtc ctg 830 
Leu Ala Ser Ile Pro His Thr Ile Ile Asn Gly Tyr Lys Arg Val Leu 
240 245 250 

gct gtc act gtc gaa aca gac tac aca ttc ccc ttg gct gag aag gtg 878 
Ala Val Thr Val Glu Thr Asp Tyr Thr Phe Pro Leu Ala Glu Lys Val 
255 260 265 

aag gcc tac ctg gct gat ccc acc gct ttc gct gtt gca gcc cct gtt 926 
Lys Ala Tyr Leu Ala Asp Pro Thr Ala Phe Ala Val Ala Ala Pro Val 
270 275 280 

gcg gca gct aca gag cag aaa tcc gct gct cct gcg gct aaa gag gag 974 
Ala Ala Ala Thr Glu Gln Lys Ser Ala Ala Pro Ala Ala Lys Glu Glu 
285 290 295 300 

gca ccc aag gag gat tct gag gag tct gat gaa gac atg ggc ttc ggc 1022 
Ala Pro Lys Glu Asp Ser Glu Glu Ser Asp Glu Asp Met Gly Phe Gly 
305 310 315 

ctg ttt gat taa accagacacc gaatatccat gtctgtttaa catcaataaa 1074 
Leu Phe Asp 
320 

acatctggaa acaaaaaaaa aaaaaaaaaa 1104 

<210> 6 

<211> 319 

<212> PRT 

<213> Danio rerio 



<400> 6 

Met Pro Arg Glu Asp Arg Ala Thr Trp Lys Ser Asn Tyr Phe Leu Lys 
1 5 10 15 

Ile Ile Gln Leu Leu Asp Asp Phe Pro Lys Cys Phe Ile Val Gly Ala 
20 25 30 

Asp Asn Val Gly Ser Lys Gln Met Gln Thr Ile Arg Leu Ser Leu Arg 
35 40 45 

Gly Lys Ala Val Val Leu Met Gly Lys Asn Thr Met Met Arg Lys Ala 
50 55 60 

Ile Arg Gly His Leu Glu Asn Asn Pro Ala Leu Glu Arg Leu Leu Pro 
65 70 75 80 

His Ile Arg Gly Asn Val Gly Phe Val Phe Thr Lys Glu Asp Leu Thr 
85 90 95 

Glu Val Arg Asp Leu Leu Leu Ala Asn Lys Val Pro Ala Ala Ala Arg 
100 105 110 

Ala Gly Ala Ile Ala Pro Cys Glu Val Thr Val Pro Ala Gln Asn Thr 
115 120 125 

Gly Leu Gly Pro Glu Lys Thr Ser Phe Phe Gln Ala Leu Gly Ile Thr 
130 135 140 

Thr Lys Ile Ser Arg Gly Thr Ile Glu Ile Leu Ser Asp Val Gln Leu 
145 150 155 160 

Ile Lys Pro Gly Asp Lys Val Gly Ala Ser Glu Ala Thr Leu Leu Asn 
165 170 175 

Met Leu Asn Met Leu Asn Ile Ser Pro Phe Ser Tyr Gly Leu Ile Ile 
180 185 190 

Gln Gln Val Tyr Asp Asn Gly Ser Val Tyr Ser Pro Glu Val Leu Asp 
195 200 205 

Ile Thr Glu Asp Ala Leu His Lys Arg Phe Leu Lys Gly Val Arg Asn 
210 215 220 

Ile Ala Ser Val Cys Leu Gln Ile Gly Tyr Pro Thr Leu Ala Ser Ile 
225 230 235 240 

Pro His Thr Ile Ile Asn Gly Tyr Lys Arg Val Leu Ala Val Thr Val 
245 250 255 

Glu Thr Asp Tyr Thr Phe Pro Leu Ala Glu Lys Val Lys Ala Tyr Leu 
260 265 270 

Ala Asp Pro Thr Ala Phe Ala Val Ala Ala Pro Val Ala Ala Ala Thr 
275 280 285 

Glu Gln Lys Ser Ala Ala Pro Ala Ala Lys Glu Glu Ala Pro Lys Glu 
290 295 300 

Asp Ser Glu Glu Ser Asp Glu Asp Met Gly Phe Gly Leu Phe Asp 
305 310 315 



<210> 7 

<211> 2241 

<212> DNA 

<213> Danio rerio 

<220> 

<221> TATA_signal 

<222> (2103) . . (2108) 

<220> 

<221> primer_bind 

<222> (2221) . . (2241) 

<223> CK2 

<220> 

<221> misc_feature 

<222> (2142) . . (2235) 

<223> Identical to the 5' CK cDNA 



<400> 7 



ccttcccttc tacttttgac gtccttttaa gagcttgtgc atgaaagcag atttggagct 60 

gattactcat ctcaaacacc catacaaagg gatgattgcc gtaccatgat ctcacacctt 120 

tcacacctgg tttatactat gatagttgta gacgattgcg taatgctatt aaatgcccat 180 

cagtgctggc tgtgacaccc aactgctgcc atttcgtgtt gacttgcacg agaaatgaga 240 

aattgtctga ctatgcaggg tgtctatgcg tgggaacatt tatcagtggt cattaaatac 300 

tatagtttac agttagacca aagtgtgctg tatttttgtg ttagcttagc tgcagttttt 360 

gtgtgtgaag taacaaatga caaatactca aactattgta attaagtagt ttttctcaga 420 

aattgtaatt tactaagtag tttaaaaatg tgtactttta ctttcccttg agtacatttt 480 

tagtgcagtg ttggtacttt tatttcactt ccttccttca acctgcagtc actactttat 540 

ttattcttgt ctatgtggat tagacaaatc agtcctgtga ttcctgtcca atcaaattgc 600 

acatagaagg taaatcacat cataatgaac taccttaaga catgggccat ttataattgc 660 

agcaaactgt ttgccagcat taaaagaaga tgtcaaaaat atttacacgc attaacccag 720 

agactgctta gatgcatgtc actgatgaga agatgatgga tgtttactgt atgatgaccg 780 

aaataacttt aaacgcacac aagacggcac aagacgtcaa catggcgtta ggttgacgtt 840 

gtaccccaac gcagtgggga cgttgcattt tgtttagaaa tgaaaattag gttgacgtca 900 

gaactcaacg tcaggtcgat gtcaatgttc aacatccaat ctaaaatcat atatcaatgt 960 

ctaatgatgt tacagcttga tgttatgcgg atgttacccc tatgacgtct atcagacgtt 1020 

ggattatggt tgccatacct gatgaataaa tgtcattatt tgacgttggt ttaagatgtt 1080 

ggttcgacat tggattttgg tcgctttcca acacaaccta aatccaccaa atattaactt 1140 

cctatgacat cgttattgga cgtcaaaata acaatatcct tagctgctgg ctagactttg 1200 

aatttaggtc accacaacct atatttaacc taatattaac atcttatgat gttgtgtgcc 1260 

tgctgggcaa taactaaatg cactacagaa tgttacgttt acacacatgt aaattacatg 1320 

taaatgcatc agcttttcac agcataatac tcactactta ctactcttga gtacttttaa 1380 

aaaagctact tttcactcat actttgagta atatttacaa ctgatacttt tactcgcact 1440 

acatttttag gcatgtattg atatttttac tatgattttt cagtactctt tccactactg 1500 

cagccctccc catacataat cgtatgttta cacatatggt ggagtttaga gccataatct 1560 

acattagctt tgttagccgc tagcattact gtgcagaatt gtgtgtgtgc acattttcca 1620 

atatcaatac agaaggaaac tgtgttccct gttcccttgt aaatctcaac aatgcaactg 1680 

ttcagctcag ggggaaaaat gccctgccag atccaaacgg ctggcaaaag tgaatggaaa 1740 

aaagcctttc attaatgtga aagttgctgc gcgccccacc cagataaaaa gagcagaggt 1800 

taacatgctc tctacggctg tccagccaac cagatactga ggcagaaaca cacccgctgg 1860 

cagatggtga gagctacact gtcttttcca gagtttctac tggaatgcct gtcctcaagt 1920 

ctcaagcctc tccttgcatt ctctcattcc acctggggca aagccccagg ctgggtgtga 1980 

caacatttat cttaccactt tctctctgta cctgtctaac aggtagggtg tgtgtgagag 2040 

tgcgtatgtg tgcaagtgcg tgtgtgtgtg agagcagtca gctccaccct ctcaagagtg 2100 

tgtataaaat tggtcagcca gctgctgaga gacacgcaga gggactttga ctctcctttg 2160 

tgagcaacct cctccactca ctcctctctc agagagcact ctcgtacctc cttctcagca 2220 

actcaaagac acaggatccg g 2241 

<210> 8 

<211> 1456 

<212> DNA 

<213> Danio rerio 

<220> 

<221> TATA_signal 
<222> (1389) . . (1394) 

<220> 

<221> primer_bind 

<222> (1433) . . (1456) 

<223> MCK2 
<220> 

<221> misc_feature 
<222> (1428) . . (1453) 

<223> Identical to the 5' MCK cDNA 



<400> 8 



gaattgcaaa 


gtcagagtaa 


taaaatgaaa 


ccaaaaaaca 


tLUUT-aoauei 




fin 


tgtggcttaa 


tcttggctga 


tgtgtgtgtg 


tgtgtgtgtg 


tacucgacag 


OLgv_uciy uy<3 




gcatgtgcac 


catgacaggc 


ctgttattca 


cacttggtgc 


cat.g xnggag 


dcugu LL.yyo 


X O VJ 


cagctatagt 


tttcttcaca 


gagtcctggg 


tcacctaatg 


ccacaaggcia. 


r-T — J -a a -a 




acatgttaaa 


atgtgacatt 


caaattgtag 


^ _ ^ A 4- 4- A m4- ^ 

tgcattactt 


aacgaaacgc 


oL. Uca.(— cl(_clciy 




ttacagctta 


aaagattgct 


agacagaaaa 


accagggagg 


ggt 1 1 1 ccca 


u a a I, a c ca g 


J V) VJ 


tgagactcta 


ggagcgggaa 


cactaacagg 


cctccctgag 


tgagaacatt 


gcatgugcgc 


4 9 n 

*i ^ VJ 


gtgacagaaa 


accagagatg 


gaaatacctt 


cttttgaatt 


gcaT_aaL.L.gc 


•f- 4- o :s ^ a :a 
l_ Uaacicigcsoy 




acacaacagg 


gatagttcac 


ccaaaaaaca 


gaccattctt 




yoaOadooo i_ 


s 4 n 


taagatattt 


tgaagaatgc 


ttaccgaata 


acttccatat 


1 1 ggaaact a 


outacaguga 


D U 


aagtcaatgg 


gtcttccagc 


atttttticaa 


4- ~ 4- ^ — , — 4~ '4~ s 
L.aX_aCCT_T-aC 




d a a u ci ci o ci a 


660 


acatctcaaa 


taggtttgag 


gttgaataaa 


_ _ ^ ^ 4- 4- 4- ^ 4- 

ca tttt t can 


ttrggggi-gg 




720 


attatttgac 


acttaagatt 


tatagtaaat 


cattttatag 






780 


atggttgaat ttatcttcat gtttatgtct gggttgtgct tttttgaaaa gatttccctg 840 

tcaaatgttt ttgtgtatgg ttggcgcaca atagactgaa ctggcctatc acacagactt 900 

tcataacaac tccagttgat gccctttcac cctcagtgta taaatatggc gtctgacatg 960 

agcagattaa acacgacact gcaacaactt tacctgtaaa aatacaaatt gagtttgcac 1020 

ccagaatcat gtggtgaacg aagcctacca agagattttt gaaagccatc ggcctgacac 1080 

gcgcacttct gatatctgtg gtatgtttgg caaaagtgct gctcagcctt tttagcatgg 1140 

cagatcctcc acatcccatc acccctcctt caacctattc cctcctggaa agctatgtat 1200 

ggggcgggaa gtgtaaatgg atatgggaag gaaggggggc accacccaca gctgccacct 1260 

catctaggat gcctggggcc taaattgaag cctttcttac actaaacagg gcataagaga 1320 

ccagcgccag ccaatcataa ttcagtgagc tctaaaatgg gccagccaat ggctgcaggg 1380 

gctagagcta tatatatcca aatcaaactc ttcttgcttg ggtgacccct atttcggctt 1440 

ggtgaacagg atccgg 1456 

<210> 9 

<211> 2205 

<212> DNA 

<213> Danio rerio 

<220> 

<221> primer_bind 
<222> (2179) . . (2205) 
<223> ARP2 

<220> 

<221> misc_feature 

<222> (2153) . . (2199) 

<223> Identical to the 5' ARP cDNA 
<220> 

<221> intron 

<222> (792) . . (2152) 

<220> 

<221> misc_feature 
<222> (775) . . (791) 

<223> Identical to the 5' ARP cDNA 
<400> 9 

atctgtatta agaaacactt aaaatatata tgcgttacga attaaaaaca aaacacgatc 60 
attttaattt gtgttgtata attttacatt ttgtaagtat tatttttata aaaaatatat 120 
agaaataata caaatttgtt tacagtattc ttagttattg caataaacga attttatata 180 
gaaagagaaa gagttttatt ataagatgtt caatttaaaa aatggcagaa aatagaaaaa 240 
tgattgtcaa gatgataaaa gtcagtttag acaaaaaaat aagatgaaaa acatcaaaat 300 
agataataaa gtgacttttt tgggcggacc aaatttccct attaatggtc aattcattaa 360 
aatacattca ttaaaataaa ggtatrgcga tgaatttaga tgcacagtga ttttggttct 420 
gtgcagattt ttggctgttg ttagaaggga tacatctgcg gccgaaagtt aacgggaact 480 
atttacattc tttgctatta aattatccat tatttgtatt ttattacccc aaccgtaaac 540 
tcaaccctca cagtaatgta aaaatattat ttattgtttt atagcgtcac agaatgatgc 600 
tatattgacc gcagctgtat cctttctaag tgcgactgta caaatacgca ctgaccgtga 660 
cagacacgtg cattgaccaa tcagcgcaca gatacgcatt ttccgcgcga ttctgattgg 720 
atgatcgact gatactaata ttgtgccgct tcctttcgcg gcctctttct ttcacgcgtc 780 
cctaccgtga ggtaaggctg acgccgctct tgtggcggtt tcttaaaatg tgttaataaa 840 
taacatcata agaggtcacg agaaggtcta cgtgtgttta atatcagcgg cggttattat 900 
tatgcgttta aagcttgtgt aatgattttt acagtaaaag ttagcactag cctgttagca 960 
caggcctcgt gcgccatgtg tgacgcgacg ttttaatagc atcttatttg attttgatga 1020 
tccgattctg atattaatca tatttatgcg taaaatgtgt gatgggtctg ctagtggaca 1080 
ttacatgcta gtacttgtgc tagtcggtcg atccacattg agatgttgcg ctatttgcca 1140 
ttttaaaacc agttactctc attttagtga aatattctta agccactaag ttaaaatttg 1200 
tcaatcacat ataattgtgt ttatgtttta tttgagtcat cataccaggt aatagtttta 1260 
tttatattag tatgtacaat ttggcataaa ctgccttcgg ttttgattga catctacttt 1320 
gtaaagttaa tcttaaaggg gtaaaggctc acccaaaaga caattcaccg tcaagtgttt 1380 
tcaaatctta tgagtttctt aatgaacatg gtatgttttg gagaaaactg gaaaccaact 1440 
accataatac aaatacagga aaaatatact atagaagtcg atggttacag gttttctgca 1500 
ttcaaaatat ctacacaagt gtttaatgga aggaactcaa gtgatttgaa aagttaaggg 1560 
tgcataaatc agttttcatt tgggtgagct gtctctaaac atttgattta gacacctcag 1620 
gcagtggtca ccaagcttgt tcctgaaggg ccagtgtcct acagatttta gctccaaccc 1680 
taattaaaca cacctgaaca agctaatcaa ggtcttacta ggtatgtttg aaacatccag 1740 
gcaggtgtgt tgatgcaaga tagagctaaa ccctgcaggg acaatggccc aacaggattg 1800 
gtgacccctg cctcaagcca tcacaaatgc attatggtat taagaaatgt gcaggttcag 1860 
ttatggacag gctgttgcag tgcttgttcg tcgttcccac tgcacaaatg aacatgattc 1920 
cttctatccc tgtctgtctg catctcatga cttgcaggga cgctggtctc agacacgttt 1980 
atagcagtaa atcaaataca atagtgctct gattatcttt aaatatttga aagcttataa 2040 
taggcaacca aattacctgg aaacagttta caaacagtaa ttcatatttt gtcatttaat 2100 
aagatgcaca caaggcaggt gtaaaagtat tgcttgtgtt tgtaatcctc agattttaca 2160 
accttgtctt taaaccggct gttcaccgat ccttggaagg gatcc 2205 






<210> 10 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Cytokeratin 
gene specific primer 

<400> 10 

cgctggagta agagatagac ctgg 
<210> 11 
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Cytokeratin 
gene specific primer 

<220> 

<221> misc_feature 
<222> (1) . . (6) 

<223> Introduced for restriction site 
<220> 

<221> misc_feature 
<222> (3) . . (8) 
<223> BamHI site 



<400> 11 

ccggatcctg tgtctttgag ttgctg 26 

<210> 12 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Muscle 
creatine kinase gene specific primer 

<220> 

<221> misc_feature 
<222> (3) . . (8) 
<223> BamHI site 

<400> 12 

ccggatcctt gggatcagat cctg 

<210> 13 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Muscle 
creatine kinase gene specific primer 

<220> 

<221> misc_feature 
<222> (1) . . (3) 

<223> Introduced for restriction site 



<220> 

<221> misc_feature 
<222> (3) . . (8) 
<223> BamHI site 

<400> 13 

ccggatcctg ttcaccaacc cgaa 

<210> 14 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Acidic 
ribosomal protein PO gene specific primer 

<400> 14 

tagttggact tccacgtgcc ctgtc 

<210> 15 
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Acidic 
ribosomal protein PO gene specific primer 

<220> 

<221> misc_feature 
<222> (1) . . (7) 

<223> Introduced for restriction site 
<220> 
<221> misc_feature 
<222> (1) . . (6) 
<223> BamHI site 

<400> 15 

ggatcccttc caaggatcgg tgaaca 

<210> 16 
<211> 51 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 

Oligonucleotide for linker used in linker-mediated 
PCR 

<400> 16 

gttcatcttt acaagctagc gctgaacaat gctgtggaca agcttgaatt c 

<210> 17 
<211> 10 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: 

Oligonucleotide for linker used in linker-mediated 
PCR 

<220> 

<223> n is a dideoxycytidine 

<400> 17 

gaattcaagn 10 

<210> 18 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: linker 
specific primer 

<400> 18 

gttcatcttt acaagctagc g 21 

<210> 19 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: linker 
specific primer 

<400> 19 

tcctgaacaa tgctgtggac 20 
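
The BamHI coordinates annotated for the gene-specific primers above can be cross-checked mechanically. The following is a minimal sketch, not part of the original listing: it assumes the primer sequences as transcribed in SEQ ID NOS: 11, 12, 13 and 15, takes the standard BamHI recognition sequence (GGATCC), and uses hypothetical helper names (`site_span`, `check_primers`) introduced only for illustration.

```python
# BamHI recognition sequence (5'-GGATCC-3'), lowercase to match the listing.
BAMHI = "ggatcc"

# Primer sequences transcribed from SEQ ID NOS: 11, 12, 13 and 15, paired with
# the BamHI coordinates given in their <222> fields (1-based, inclusive).
PRIMERS = {
    11: ("ccggatcctgtgtctttgagttgctg", (3, 8)),
    12: ("ccggatccttgggatcagatcctg", (3, 8)),
    13: ("ccggatcctgttcaccaacccgaa", (3, 8)),
    15: ("ggatcccttccaaggatcggtgaaca", (1, 6)),
}


def site_span(seq, site=BAMHI):
    """Return the 1-based inclusive (start, end) of the first match, or None."""
    index = seq.find(site)
    if index < 0:
        return None
    return (index + 1, index + len(site))


def check_primers(primers=PRIMERS):
    """Map each SEQ ID NO to True if the annotated span matches the sequence."""
    return {seq_id: site_span(seq) == span for seq_id, (seq, span) in primers.items()}


if __name__ == "__main__":
    for seq_id, ok in sorted(check_primers().items()):
        print(f"SEQ ID NO:{seq_id} BamHI annotation consistent: {ok}")
```

A mismatch reported by such a check would point to a transcription error in either the primer sequence or its feature coordinates.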
CLAIMS 

1. A zebrafish cytokeratin gene promoter which is capable of directing a 
structural gene to be predominantly expressed in skin epithelia when it is inserted in front 
of the structural gene and introduced into fish embryos. 

2. A zebrafish muscle creatine kinase gene promoter which is capable of 
directing a structural gene to be specifically expressed in muscles when it is inserted in 
front of the structural gene and introduced into fish embryos. 

3. A zebrafish acidic ribosomal protein PO gene promoter which is capable of 
directing a structural gene to be expressed ubiquitously in all tissues when it is inserted in 
front of the structural gene and introduced into fish embryos. 

4. A recombinant DNA molecule comprising a structural gene and the 
promoter of claim 1, 2 or 3 arranged upstream of said structural gene. 

5. A chimeric gene comprising the promoter of claim 1, 2 or 3, operatively 
linked to DNA encoding a protein selected from the group consisting of GFP, modified 
GFP, EGFP, BFP, EBFP, YFP, EYFP, CFP, ECFP, luciferase, β-galactosidase, and 
chloramphenicol acetyltransferase. 

6. A transgenic fish comprising a chimeric gene comprising the promoter of 
claim 1, 2 or 3. 

7. The transgenic fish of claim 6, which contains said promoter in germ cells 
and/or in somatic cells and which is capable of breeding with either a said transgenic fish 
or a non-transgenic fish to produce viable and fertile transgenic progeny. 

8. The transgenic fish of claim 6, and progeny of said fish that emits green 
fluorescence under a blue light. 

9. A transgenic fish comprising a DNA that encodes a fluorescent protein 
under control of a promoter that causes said DNA (1) to be predominantly expressed in 
skin epithelia, (2) to be specifically expressed in muscles or (3) to be expressed 
ubiquitously in all tissues. 

10. The transgenic fish of claim 9, wherein said promoter is a promoter which 
naturally occurs in non-transgenic fish of the same species as the transgenic fish. 



11. A recombinant DNA vector comprising a promoter DNA that hybridizes 
under stringent conditions to a polynucleotide of any one of SEQ ID NOS:7, 8 or 9, 
operatively linked to a structural gene encoding a fluorescent or chemiluminescent protein. 

12. A cell transformed with the vector of claim 11. 

13. A transgenic fish comprising a chimeric gene in turn comprising a promoter 
DNA that hybridizes under stringent conditions to a polynucleotide of any one of SEQ ID 
NOS:7, 8 or 9, operatively linked to a structural gene encoding a fluorescent or a 
chemiluminescent protein. 

14. A method for sensing a steroid hormone or a steroid hormone derivative in a 
water sample comprising: 

(a) contacting a fish expressing a fluorescent or chemiluminescent 
protein under control of an estrogen- or other steroid hormone-inducible promoter with a 
sample of water; and 

(b) measuring the amount of fluorescent or chemiluminescent light from 

said fish. 
ABSTRACT 

CHIMERIC GENE CONSTRUCTS FOR GENERATION OF FLUORESCENT 
TRANSGENIC ORNAMENTAL FISH 

Three zebrafish gene promoters, which are skin-specific, muscle-specific and ubiquitously 
expressed, respectively, were isolated and ligated to the 5' end of the EGFP gene. When the 
resulting chimeric gene constructs were introduced into zebrafish, the transgenic zebrafish 
emitted green fluorescence under a blue light according to the specificity of the promoters 
used. Thus, new varieties of ornamental fish with different fluorescence patterns, e.g., skin 
fluorescence, muscle fluorescence, and/or ubiquitous fluorescence, are developed. 



(FIG. A is to be published) 
BamHI 
EcoRV BamHI 
FIG. 5 
PstI BamHI 
FIG. 6 
FIG. 8 



